DETECTION METHODS



Cross Reference to Related Applications 

This application claims priority to United States provisional patent
application numbers 60/413,583, filed September 25, 2002, and 60/491,842, filed
August 1, 2003, the disclosures of which are incorporated herein by reference in
their entirety.

Field of Invention 

The present invention relates to single nucleotide polymorphisms in
nucleic acids involved in encoding enzymes in the testosterone biosynthetic
pathway and to methods for detecting such polymorphisms. The invention has 
utility in the diagnosis, prognosis, prevention and treatment of disease, 
particularly those relating to prostate cancer and breast cancer. 

Background of the Invention 

Prostate cancer is the most common non-skin cancer in males all over the 
world. Currently, there are no means to predict how aggressive an individual's 
cancer will be. Thus, many patients are given unnecessarily drastic treatment with
severe side effects, while others possibly do not receive sufficiently effective
treatment.

Incidence of prostate cancer shows strong age dependence, being a 
disease of old men, and strong race dependence, being almost twice as common 
in African Americans as in Caucasians, while Asian populations have the lowest
risk (Cook et al. (1999) J Urol 161, 152-155; Hsing et al. (2000) Int J Cancer 85,
60-67). The third well-known risk factor is having a family history of prostate 
cancer (Cerhan et al.(1999) Cancer Epidemiol Biomarkers Prev 8, 53-60; Kalish 
et al. (2000) Urology 56, 803-806), and several studies have supported the 
presence of predisposing genetic factors. 

Genome-wide linkage analyses have pointed to multiple chromosomal
regions showing linkage in prostate cancer families, and several prostate cancer
candidate loci have been suggested: HPC1 in 1q24 (Smith et al. (1996) Science
274, 1371-1374), HPCX in Xq27 (Xu et al. (1998) Nat Genet 20, 175-179), PCAP
in 1q42.2 (Berthon et al. (1998) Am J Hum Genet 62, 1416-1424), CABP in 1p36
(Gibbs et al. (1999) Am J Hum Genet 64, 776-787), and HPC2/ELAC2 in 17p
(Tavtigian et al. (2001) Nat Genet 27, 172-180). Recently, a candidate
cancer-susceptibility gene, RNASEL, was cloned at the HPC1 locus (Carpten et al.
(2002) Nat Genet 30, 181-184) and two possibly deleterious germline mutations
segregating in prostate cancer families were discovered.

The growth of prostate cells is dependent on active testosterone (Ekman 
(1995) J Urol 101, 22-25) and strikingly, prostate adenocarcinomas can be 
created by testosterone administration in rats (Gupta et al. (1999) Cancer Res
59, 2115-2120). Testosterone seems to be a strong tumour promoter for the rat
prostate, even at doses that do not measurably increase circulating testosterone 
(Bosland et al. (1991) Princess Takamatsu Symp 22, 109-123). Consequently, 
genes involved in the testosterone biosynthetic pathway, e.g., CYP17, CYP3A4, 
and SRD5A2 (Figure 1) are good candidates for being involved in the initiation 
and progression of prostate cancer. Several polymorphisms have been 
discovered in these genes and some of them show association either with 
increased risk or progression of prostate cancer (Table 1). Nevertheless, there is 
no evidence of higher testosterone levels in prostate cancer patients. 

Approximately 55 different Cytochrome P450 genes are present in the 
human genome and are classified into different families and subfamilies on the 
basis of sequence homology. Members of the CYP3A subfamily catalyze the 
oxidative, peroxidative and reductive metabolism of different endobiotics, drugs, 
and protoxic or procarcinogenic molecules. As an example, CYP3A4 is 
responsible for the oxidative metabolism of an estimated 60% of all clinically used 
drugs. Up to 30-fold interindividual differences in expression have been detected,
causing variation in oral bioavailability and systemic clearance of CYP3A 
substrates, such as HIV protease inhibitors, several calcium channel blockers 
and some cholesterol-lowering drugs. Variation in CYP3A expression is 
particularly important in substrates with narrow therapeutic indices, such as 
cancer chemotherapeutics and immunosuppressants. Variation in CYP3A 
expression can result in clinically significant differences in drug toxicities and 
response.

As with prostate cancer, breast cancer also shows age-dependency,
indicating a possible hormonal influence on the disease risk. Endogenous
oestradiol synthesis takes place in the ovarian theca cells of pre-menopausal
women, in the stromal adipose cells of the breast of postmenopausal women,
and in minor quantities in peripheral tissue. These cells, as well as breast cancer
tissue, express all the necessary enzymes for this synthesis, including CYP17,
and enzymes that further hydroxylate oestradiol, such as CYP3A4 (Kristensen et
al. (2000) Mutat Res 462, 323-333). Thus, polymorphisms in these enzymes may
also be associated with the risk of breast cancer (Kristensen et al. (2000) Mutat
Res 462, 323-333). Furthermore, CYP3A4 is also involved in the activation of
many mammary carcinogens, such as the polycyclic aromatic hydrocarbons and
heterocyclic amines (Guengerich et al. (1991) Chem Res Toxicol. 4, 168-179).
According to a recent study (Zheng et al. (2001) Cancer Epidemiol Biomarkers
Prev 10, 237-242), high CYP3A4 activity may be a risk factor for breast cancer.

Single nucleotide polymorphisms (SNPs) are the most common type of 
genetic variation in the human genome, and are expected to be helpful in 
identifying human disease genes. In addition to occurring frequently, on average 
every 500-2,000 bp (Li & Sadler (1991) Genetics 129, 513-523; Chakravarti
(1998) Nat Genet 19, 216-217; Cargill et al. (1999) Nat Genet 22, 231-238;
Halushka et al. (1999) Nat Genet 22, 239-247), SNPs have a low mutation rate 
when compared to microsatellite markers, both of which are characteristics that 
may have particular advantages for association analysis. The utility of SNPs is 
not only in their use as markers for discovering additional functional variants and
for the general evaluation of a specific gene in the context of a given clinical
phenotype but also in their potential functional relevance. However, rather than 
finding a single SNP with drastic effect on the phenotype, more likely it will be 
multiple SNPs in relevant genes, either linked (i.e., grouped as a haplotype) or 
independent (perhaps on different chromosomes), that contribute to the
phenotype.

Recently, several studies have shown the utility of haplotypes, i.e., a 
combination of SNPs with alleles physically assigned to a chromosome, in 
association analysis (Daly et al. (2001) Nat Genet 29, 229-232). Studying 



WO 2004/028346 




PCT/US2003/030359 



haplotypes might give the analysis more power but traditionally demands either 
samples from multiple generations or tedious molecular haplotyping. 
Alternatively, several algorithms have been developed for inferring haplotypes 
from genotype data (Clark (1990) Mol Biol Evol 7, 111-122; Excoffier & Slatkin
(1995) Mol Biol Evol 12, 921-927; Stephens et al. (2001) Am J Hum Genet 68,
978-989). These algorithms have been proven to work with a very low error rate 
(Drysdale et al. (2000) PNAS 97, 10483-10488). In a sense, haplotyping is 
equivalent to performing a study in a family or other select group of people. It 
helps to get back the power of linkage, and can be regarded as a crucial step in
association studies using random individuals.
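
By way of illustration only, the inference of haplotypes from unphased genotype data can be sketched as a small expectation-maximisation loop. This is a minimal sketch in the general spirit of the statistical approaches cited above, not a reproduction of any of the cited algorithms. It assumes biallelic SNPs coded as the count (0, 1 or 2) of one allele per site; the function names `compatible_pairs` and `em_haplotype_frequencies` are hypothetical.

```python
from collections import Counter
from itertools import product


def compatible_pairs(genotype):
    """Return all unordered haplotype pairs consistent with an unphased genotype.

    genotype: tuple of allele counts (0, 1 or 2) for each SNP (assumed coding).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites fixed; het sites filled below
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, iterations=50):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    counts = Counter(genotypes)
    options = {g: compatible_pairs(g) for g in counts}
    haplotypes = sorted({h for prs in options.values() for pr in prs for h in pr})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    total_chromosomes = 2 * sum(counts.values())
    for _ in range(iterations):
        expected = dict.fromkeys(haplotypes, 0.0)
        for g, n in counts.items():
            # E-step: weight each possible phase by its probability under
            # random mating (factor 2 when the two haplotypes differ).
            weights = [freq[a] * freq[b] * (1 if a == b else 2)
                       for a, b in options[g]]
            norm = sum(weights)
            for (a, b), w in zip(options[g], weights):
                share = n * w / norm
                expected[a] += share
                expected[b] += share
        # M-step: new frequencies are the expected haplotype counts, normalised.
        freq = {h: expected[h] / total_chromosomes for h in haplotypes}
    return freq
```

In this sketch, individuals who are heterozygous at more than one site are the only ambiguous ones; the unambiguous individuals in the sample determine how their phase is weighted.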

WO02/055735 discloses specific nucleic acids useful for identifying, 
diagnosing, monitoring, staging, imaging and treating prostate cancer and breast 
cancer. Similar compositions comprising prostate specific nucleic acids are 
described by the same applicant (Diadexus Inc.) in related applications 

(WO02/42776, WO02/42499, WO02/42463, WO02/42329, WO02/39431,
WO02/239431, WO02/38810, WO02/38810, WO02/236808 and WO02/24718).

Diadexus Inc. have also disclosed a method of diagnosing, monitoring, 
staging, imaging and treating prostate and breast cancer by means of specific 
nucleic acids, in a series of related applications (WO01/39798, WO00/23111 and
WO00/23108).

WO01/53537 (DZ Genes Inc.) describes isolated polynucleotides 
containing at least one polymorphism useful for the diagnosis of disease, 
particularly prostate and breast cancer. 

Single nucleotide polymorphisms associated with prostate cancer are
disclosed in WO01/83828, as are methods for using these SNPs to determine
susceptibility to this disease. 

In order to improve the lives of prostate and breast cancer patients it is 
essential to develop prognostic markers for cancer as well as markers allowing 
general assessment of disease risk. Patients need to be categorized into those
needing immediate, extensive treatment, and those who just need watchful
waiting. As a result, prostate and breast cancer mortality could be reduced and 
unnecessary side effects caused by invasive treatments could be avoided. There 
is therefore a need for prognostic molecular markers for aggressive breast and
prostate cancer to aid in predicting, diagnosing and monitoring these diseases in
individuals. Furthermore, there is a continued need for improved methods of
treatment of both conditions in patients. The present invention addresses these
needs and provides improvements over the prior art in the form of novel and
specific nucleic acids, microarrays and kits useful for the diagnosis of breast and
prostate cancer. 

Summary of the Invention 

According to the first aspect of the present invention, there is provided an
isolated polynucleotide selected from the group consisting of a nucleotide
sequence comprising one or more polymorphic sequences of SEQ ID NOS 1-34.
Suitably, a fragment of the isolated polynucleotide comprises a polymorphic site 
in the polymorphic sequence. 

In a second aspect of the present invention, there is provided an isolated
polynucleotide comprising a sequence complementary to one or more of the
polymorphic sequences of SEQ ID NOS 1-34. Suitably, a fragment of the 
complementary nucleotide sequence comprises a polymorphic site in the 
polymorphic sequence. 

Preferably, the polynucleotides of the first and second aspect comprise
DNA, RNA, cDNA, or mRNA.

Preferably, at least one single nucleotide polymorphism of the isolated
polynucleotide is at a position selected from the group consisting of position
[CYP3A4_IVS9 +187] of SEQ ID No. 1, position [CYP3A4, 1639 base pairs after
the stop codon] of SEQ ID No. 2, position [CYP3A4, 945 base pairs after the stop
codon] of SEQ ID No. 3, position [CYP3A4_5' region -747] of SEQ ID No. 4,
position [CYP3A4_IVS7 -202] of SEQ ID No. 5, position [CYP3A4, 2204 base
pairs after the stop codon] of SEQ ID No. 6, position [CYP3A4_IVS2 -132] of
SEQ ID No. 7, position [CYP3A4_IVS1 -868] of SEQ ID No. 8, position
[CYP3A4_5' region -847] of SEQ ID No. 9, position [CYP3A4, 766 base pairs
after the stop codon] of SEQ ID No. 10, position [CYP3A4, 1454 base pairs after
the stop codon] of SEQ ID No. 11, position [CYP3A4_IVS3 +1992] of SEQ ID No.
12, position [CYP3A4_IVS9 +841] of SEQ ID No. 13, position [CYP3A4_IVS12
-473] of SEQ ID No. 14, position [CYP3A4_IVS12 +581] of SEQ ID No. 15,
position [CYP3A4_IVS12 +586] of SEQ ID No. 16, position [CYP3A4_IVS12
+646] of SEQ ID No. 17, position [CYP3A4_IVS3 -734] of SEQ ID No. 18,
position [CYP17_IVS1 -271] of SEQ ID No. 19, position [CYP17_IVS5 +75] of
SEQ ID No. 20, position [CYP17_IVS1 +426] of SEQ ID No. 21, position
[CYP17_IVS1 -99] of SEQ ID No. 22, position [CYP17_IVS1 -700] of SEQ ID No.
23, position [CYP17_IVS1 -565] of SEQ ID No. 24, position [CYP17_IVS3 +141]
of SEQ ID No. 25, position [CYP17_5' region -1488] of SEQ ID No. 26, position
[CYP17_5' region -1204] of SEQ ID No. 27, position [CYP17_IVS1 +466] of SEQ
ID No. 28, position [CYP17, 712 base pairs after the stop codon] of SEQ ID No.
29, position [SRD5A2, 1356 base pairs after the stop codon (3' UTR)] of SEQ ID
No. 30, position [SRD5A2, 849 base pairs after the stop codon (3' UTR)] of SEQ
ID No. 31, position [SRD5A2_5' region -870] of SEQ ID No. 32, position
[SRD5A2_5' region between -2036 and -2030] of SEQ ID No. 33, and position
[SRD5A2, 545 base pairs after the stop codon (3' UTR)] of SEQ ID No. 34.

More preferably, at least one single nucleotide polymorphism is selected
from the group consisting of [CYP3A4_IVS9 +187C>G] of SEQ ID No. 1,
[CYP3A4, 1639 base pairs after the stop codon, A>T] of SEQ ID No. 2, [CYP3A4,
945 base pairs after the stop codon, A>T] of SEQ ID No. 3, [CYP3A4_5' region
-747C>G] of SEQ ID No. 4, [CYP3A4_IVS7 -202C>T] of SEQ ID No. 5, [CYP3A4,
2204 base pairs after the stop codon, G>C] of SEQ ID No. 6, [CYP3A4_IVS2
-132C>T] of SEQ ID No. 7, [CYP3A4_IVS1 -868C>T] of SEQ ID No. 8,
[CYP3A4_5' region -847A>T] of SEQ ID No. 9, [CYP3A4, 766 base pairs after
the stop codon, delT] of SEQ ID No. 10, [CYP3A4, 1454 base pairs after the stop
codon, C>T] of SEQ ID No. 11, [CYP3A4_IVS3 +1992T>C] of SEQ ID No. 12,
[CYP3A4_IVS9 +841T>G] of SEQ ID No. 13, [CYP3A4_IVS12 -473T>G] of SEQ
ID No. 14, [CYP3A4_IVS12 +581C>T] of SEQ ID No. 15, [CYP3A4_IVS12
+586G>A] of SEQ ID No. 16, [CYP3A4_IVS12 +646C>A] of SEQ ID No. 17,
[CYP3A4_IVS3 -734G>A] of SEQ ID No. 18, [CYP17_IVS1 -271A>C] of SEQ ID
No. 19, [CYP17_IVS5 +75C>G] of SEQ ID No. 20, [CYP17_IVS1 +426G>A] of
SEQ ID No. 21, [CYP17_IVS1 -99C>T] of SEQ ID No. 22, [CYP17_IVS1
-700C>G] of SEQ ID No. 23, [CYP17_IVS1 -565G>A] of SEQ ID No. 24,
[CYP17_IVS3 +141A>T] of SEQ ID No. 25, [CYP17_5' region -1488C>G] of SEQ
ID No. 26, [CYP17_5' region -1204C>T] of SEQ ID No. 27, [CYP17_IVS1
+466G>A] of SEQ ID No. 28, [CYP17, 712 base pairs after the stop codon, G>A]
of SEQ ID No. 29, [SRD5A2, 1356 base pairs after the stop codon (3' UTR), A>C]
of SEQ ID No. 30, [SRD5A2, 849 base pairs after the stop codon (3' UTR), A>G]
of SEQ ID No. 31, [SRD5A2_5' region -870G>A] of SEQ ID No. 32, [SRD5A2_5'
region -2036(A)7-8] of SEQ ID No. 33, and [SRD5A2, 545 base pairs after the
stop codon (3' UTR), T>C] of SEQ ID No. 34.
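
Many of the bracketed labels used throughout this specification, for example [CYP3A4_5' region -847A>T] or [SRD5A2_IVS2+626C>T], follow a regular pattern of gene symbol, region, signed offset and base change. As a hedged illustration of how such labels might be handled when tabulating genotyping results, the following sketch parses that pattern into a small record. The class name `SnpLabel`, the function `parse_snp_label` and the regular expression are assumptions made for this example. Labels of other forms (for example "base pairs after the stop codon", deletions or repeat-length variants) are deliberately not handled and yield `None`.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SnpLabel:
    gene: str      # e.g. "CYP3A4"
    region: str    # e.g. "IVS9" or "5' region"
    offset: int    # signed offset as written in the label
    ref: str       # base written before ">"
    alt: str       # base written after ">"


# Assumed label shape: GENE_REGION OFFSETref>alt, with optional spaces.
_LABEL = re.compile(
    r"^(?P<gene>[A-Z0-9]+)_\s*(?P<region>IVS\d+|5' region)\s*"
    r"(?P<offset>[+-]\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_snp_label(label: str) -> Optional[SnpLabel]:
    """Parse an intronic or 5'-region substitution label; None if it does not fit."""
    match = _LABEL.match(label.strip())
    if match is None:
        return None
    return SnpLabel(
        gene=match["gene"],
        region=match["region"],
        offset=int(match["offset"]),
        ref=match["ref"],
        alt=match["alt"],
    )
```

A structured record of this kind makes it straightforward to group results by gene or region, while labels the pattern does not cover can be kept as free text.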

Optionally, the polynucleotide is the complement of any of the isolated 
polynucleotides hereinbefore described. 

In one aspect, the polynucleotide comprises part of the CYP17 gene, the 
CYP3A4 gene or the SRD5A2 gene. 

Preferably, the isolated polynucleotide further comprises a detectable 
label. More preferably, the detectable label is selected from the group consisting 
of fluorophore, radionuclide, peptide, enzyme, antibody and antigen. In a 
preferred embodiment, the fluorophore is a fluorescent compound selected from 
the group consisting of Hoechst 33342, Cy2, Cy3, Cy5, CypHer, coumarin, FITC, 
DAPI, Alexa 633, DRAQ5 and Alexa 488. 

In a third aspect of the present invention, there is provided a method for
diagnosing a genetic susceptibility for a disease, condition or disorder related to
prostate or breast cancer in a subject, the method comprising analysing a
biological sample containing nucleic acid obtained from the subject to detect the
presence or absence of one or more single nucleotide polymorphisms at a
position selected from the group consisting of position [CYP3A4_IVS9 +187] of
SEQ ID No. 1, position [CYP3A4, 1639 base pairs after the stop codon] of SEQ
ID No. 2, position [CYP3A4, 945 base pairs after the stop codon] of SEQ ID No.
3, position [CYP3A4_5' region -747] of SEQ ID No. 4, position [CYP3A4_IVS7
-202] of SEQ ID No. 5, position [CYP3A4, 2204 base pairs after the stop codon] of
SEQ ID No. 6, position [CYP3A4_IVS2 -132] of SEQ ID No. 7, position
[CYP3A4_IVS1 -868] of SEQ ID No. 8, position [CYP3A4_5' region -847] of SEQ
ID No. 9, position [CYP3A4, 766 base pairs after the stop codon] of SEQ ID No.
10, position [CYP3A4, 1454 base pairs after the stop codon] of SEQ ID No. 11,
position [CYP3A4_IVS3 +1992] of SEQ ID No. 12, position [CYP3A4_IVS9 +841]
of SEQ ID No. 13, position [CYP3A4_IVS12 -473] of SEQ ID No. 14, position
[CYP3A4_IVS12 +581] of SEQ ID No. 15, position [CYP3A4_IVS12 +586] of
SEQ ID No. 16, position [CYP3A4_IVS12 +646] of SEQ ID No. 17, position
[CYP3A4_IVS3 -734] of SEQ ID No. 18, position [CYP17_IVS1 -271] of SEQ ID
No. 19, position [CYP17_IVS5 +75] of SEQ ID No. 20, position [CYP17_IVS1
+426] of SEQ ID No. 21, position [CYP17_IVS1 -99] of SEQ ID No. 22, position
[CYP17_IVS1 -700] of SEQ ID No. 23, position [CYP17_IVS1 -565] of SEQ ID
No. 24, position [CYP17_IVS3 +141] of SEQ ID No. 25, position [CYP17_5'
region -1488] of SEQ ID No. 26, position [CYP17_5' region -1204] of SEQ ID No.
27, position [CYP17_IVS1 +466] of SEQ ID No. 28, position [CYP17, 712 base
pairs after the stop codon] of SEQ ID No. 29, position [SRD5A2, 1356 base pairs
after the stop codon (3' UTR)] of SEQ ID No. 30, position [SRD5A2, 849 base
pairs after the stop codon (3' UTR)] of SEQ ID No. 31, position [SRD5A2_5'
region -870] of SEQ ID No. 32, position [SRD5A2_5' region between -2036 and
-2030] of SEQ ID No. 33, position [SRD5A2, 545 base pairs after the stop codon
(3' UTR)] of SEQ ID No. 34, position [SRD5A2_IVS2 +626] of SEQ ID No. 35,
position [SRD5A2_5' region -8029] of SEQ ID No. 36, position
[CYP3A4_IVS7 +34] of SEQ ID No. 42, position [CYP3A4_5' region -1232] of
SEQ ID No. 43, position [SRD5A2_5' region -3001] of SEQ ID No. 44, and
position [SRD5A2, 1552 base pairs after the stop codon] of SEQ ID No. 45.
Suitably, the nucleic acid is DNA, RNA, cDNA or mRNA.

Preferably, the single nucleotide polymorphism is selected from the group
consisting of [CYP3A4_IVS9 +187C>G] of SEQ ID No. 1, [CYP3A4, 1639 base
pairs after the stop codon, A>T] of SEQ ID No. 2, [CYP3A4, 945 base pairs after
the stop codon, A>T] of SEQ ID No. 3, [CYP3A4_5' region -747C>G] of SEQ ID
No. 4, [CYP3A4_IVS7 -202C>T] of SEQ ID No. 5, [CYP3A4, 2204 base pairs
after the stop codon, G>C] of SEQ ID No. 6, [CYP3A4_IVS2 -132C>T] of SEQ
ID No. 7, [CYP3A4_IVS1 -868C>T] of SEQ ID No. 8, [CYP3A4_5' region
-847A>T] of SEQ ID No. 9, [CYP3A4, 766 base pairs after the stop codon, delT]
of SEQ ID No. 10, [CYP3A4, 1454 base pairs after the stop codon, C>T] of SEQ
ID No. 11, [CYP3A4_IVS3 +1992T>C] of SEQ ID No. 12, [CYP3A4_IVS9
+841T>G] of SEQ ID No. 13, [CYP3A4_IVS12 -473T>G] of SEQ ID No. 14,
[CYP3A4_IVS12 +581C>T] of SEQ ID No. 15, [CYP3A4_IVS12 +586G>A] of
SEQ ID No. 16, [CYP3A4_IVS12 +646C>A] of SEQ ID No. 17, [CYP3A4_IVS3
-734G>A] of SEQ ID No. 18, [CYP17_IVS1 -271A>C] of SEQ ID No. 19,
[CYP17_IVS5 +75C>G] of SEQ ID No. 20, [CYP17_IVS1 +426G>A] of SEQ ID
No. 21, [CYP17_IVS1 -99C>T] of SEQ ID No. 22, [CYP17_IVS1 -700C>G] of
SEQ ID No. 23, [CYP17_IVS1 -565G>A] of SEQ ID No. 24, [CYP17_IVS3
+141A>T] of SEQ ID No. 25, [CYP17_5' region -1488C>G] of SEQ ID No. 26,
[CYP17_5' region -1204C>T] of SEQ ID No. 27, [CYP17_IVS1 +466G>A] of
SEQ ID No. 28, [CYP17, 712 base pairs after the stop codon, G>A] of SEQ ID
No. 29, [SRD5A2, 1356 base pairs after the stop codon (3' UTR), A>C] of SEQ ID
No. 30, [SRD5A2, 849 base pairs after the stop codon (3' UTR), A>G] of SEQ ID
No. 31, [SRD5A2_5' region -870G>A] of SEQ ID No. 32, [SRD5A2_5' region
-2036(A)7-8] of SEQ ID No. 33, [SRD5A2, 545 base pairs after the stop codon (3'
UTR), T>C] of SEQ ID No. 34, [SRD5A2_IVS2 +626C>T] of SEQ ID No. 35,
[SRD5A2_5' region -8029C>T] of SEQ ID No. 36, [CYP3A4_IVS7 +34T>G] of
SEQ ID No. 42, [CYP3A4_5' region -1232C>T] of SEQ ID No. 43, [SRD5A2_5'
region -3001G>A] of SEQ ID No. 44, and [SRD5A2, 1552 base pairs after the
stop codon, G>A] of SEQ ID No. 45.

Optionally, the single nucleotide polymorphism is selected from the 
complement of any of the single nucleotide polymorphisms described 
hereinbefore. 

Suitably, the analysis is accomplished by sequencing, genotyping, 
fragment analysis, hybridisation, restriction fragment analysis, oligonucleotide 
ligation or allele specific PCR. Preferably, the analysis is accomplished by 
hybridisation, the method comprising the steps of 

i) contacting the nucleic acid with an oligonucleotide that hybridises to
one or more isolated polynucleotide polymorphic sequences selected
from the group consisting of SEQ ID NOS 1-36, 42-45 or its
complement; and

ii) determining whether the nucleic acid and the oligonucleotide
hybridise;

whereby hybridisation of the nucleic acid to the oligonucleotide indicates the 
presence of the polymorphic site in the nucleic acid. 
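
The logic of the two hybridisation steps can be illustrated in silico. The sketch below is a deliberately idealised model, offered under stated assumptions: it treats "hybridisation" as perfect Watson-Crick complementarity between an allele-specific probe and the sample strand, and ignores stringency, melting temperature and mismatch tolerance, which govern real assays. The function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
# Idealised model: a probe "hybridises" only if it is perfectly
# complementary to some stretch of the sample strand.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridises(sample, probe):
    """Step ii): does the probe find a perfectly complementary site in the sample?"""
    return reverse_complement(probe) in sample.upper()


def call_alleles(sample, flank5, flank3, alleles):
    """Step i) for each allele: build an allele-specific probe and test it.

    flank5 / flank3: sample-strand sequence either side of the polymorphic site.
    Returns the alleles whose probe hybridises under this idealised model.
    """
    detected = []
    for allele in alleles:
        probe = reverse_complement(flank5 + allele + flank3)
        if hybridises(sample, probe):
            detected.append(allele)
    return detected


# Hypothetical sample strand carrying a C at the polymorphic site.
example_sample = "TTGACCGTAGCTAGGCTTACG"
example_calls = call_alleles(example_sample, "GTAG", "TAGG", ["C", "G"])
```

Here only the probe built for the C allele finds its target, so the call identifies which variant is present at the site.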

In a fourth aspect of the present invention, there is provided a method for 
diagnosing a genetic susceptibility for a disease, condition or disorder related to 
prostate or breast cancer in a subject, or predicting an individual's response to a
drug, the method comprising adding an antibody to a polypeptide present in a
biological sample obtained from the subject, which polypeptide is encoded by a
polynucleotide selected from the group consisting of SEQ ID NOS 1-36 and SEQ
ID NOS 42-45, or the complement thereof, and detecting specific binding of the
antibody to the polypeptide.

In a fifth aspect of the present invention, there is provided a kit comprising
at least one isolated polynucleotide of at least 5 contiguous nucleotides of SEQ
ID NOS: 1-36 or 42-45, or the complement thereof, and containing at least one
single nucleotide polymorphic site associated with a disease, condition or
disorder related to prostate or breast cancer, together with instructions for the use
thereof for detecting the presence or the absence of said at least one single
nucleotide polymorphism in said nucleic acid.

In a sixth aspect of the present invention, there is provided an 
oligonucleotide array comprising at least one oligonucleotide capable of 
hybridising to a first polynucleotide at a polymorphic site encompassed therein,
wherein the first polynucleotide comprises a nucleotide sequence comprising one 
or more polymorphic sequences of SEQ ID NOS: 1-36 and SEQ ID NOS: 42-45. 

Suitably, the first polynucleotide comprises a fragment of any of the 
nucleotide sequences, the fragment comprising a polymorphic site in the 
polymorphic sequence.

Suitably, the first polynucleotide is a complementary nucleotide sequence 
comprising a sequence complementary to one or more polymorphic sequences of 
SEQ ID NOS: 1-36 and SEQ ID NOS: 42-45. 

Suitably, the first polynucleotide comprises a fragment of said 
complementary sequence, the fragment comprising a polymorphic site in the
polymorphic sequence. 

Suitably, the position of the polymorphic site in the kit or the microarray as
hereinbefore described is at a position selected from the group consisting of
position [CYP3A4_IVS9 +187] of SEQ ID No. 1, position [CYP3A4, 1639 base
pairs after the stop codon] of SEQ ID No. 2, position [CYP3A4, 945 base pairs
after the stop codon] of SEQ ID No. 3, position [CYP3A4_5' region -747] of SEQ
ID No. 4, position [CYP3A4_IVS7 -202] of SEQ ID No. 5, position [CYP3A4,
2204 base pairs after the stop codon] of SEQ ID No. 6, position [CYP3A4_IVS2
-132] of SEQ ID No. 7, position [CYP3A4_IVS1 -868] of SEQ ID No. 8, position
[CYP3A4_5' region -847] of SEQ ID No. 9, position [CYP3A4, 766 base pairs
after the stop codon] of SEQ ID No. 10, position [CYP3A4, 1454 base pairs after
the stop codon] of SEQ ID No. 11, position [CYP3A4_IVS3 +1992] of SEQ ID No.
12, position [CYP3A4_IVS9 +841] of SEQ ID No. 13, position [CYP3A4_IVS12
-473] of SEQ ID No. 14, position [CYP3A4_IVS12 +581] of SEQ ID No. 15,
position [CYP3A4_IVS12 +586] of SEQ ID No. 16, position [CYP3A4_IVS12
+646] of SEQ ID No. 17, position [CYP3A4_IVS3 -734] of SEQ ID No. 18,
position [CYP17_IVS1 -271] of SEQ ID No. 19, position [CYP17_IVS5 +75] of
SEQ ID No. 20, position [CYP17_IVS1 +426] of SEQ ID No. 21, position
[CYP17_IVS1 -99] of SEQ ID No. 22, position [CYP17_IVS1 -700] of SEQ ID No.
23, position [CYP17_IVS1 -565] of SEQ ID No. 24, position [CYP17_IVS3 +141]
of SEQ ID No. 25, position [CYP17_5' region -1488] of SEQ ID No. 26, position
[CYP17_5' region -1204] of SEQ ID No. 27, position [CYP17_IVS1 +466] of SEQ
ID No. 28, position [CYP17, 712 base pairs after the stop codon] of SEQ ID No.
29, position [SRD5A2, 1356 base pairs after the stop codon (3' UTR)] of SEQ ID
No. 30, position [SRD5A2, 849 base pairs after the stop codon (3' UTR)] of SEQ
ID No. 31, position [SRD5A2_5' region -870] of SEQ ID No. 32, position
[SRD5A2_5' region between -2036 and -2030] of SEQ ID No. 33, position
[SRD5A2, 545 base pairs after the stop codon (3' UTR)] of SEQ ID No. 34,
position [SRD5A2_IVS2 +626] of SEQ ID No. 35, position [SRD5A2_5' region
-8029] of SEQ ID No. 36, position [CYP3A4_IVS7 +34] of SEQ ID No. 42, position
[CYP3A4_5' region -1232] of SEQ ID No. 43, position [SRD5A2_5' region -3001]
of SEQ ID No. 44, and position [SRD5A2, 1552 base pairs after the stop codon] of
SEQ ID No. 45.

Preferably, at least one single nucleotide polymorphism is selected from
the group consisting of [CYP3A4_IVS9 +187C>G] of SEQ ID No. 1, [CYP3A4,
1639 base pairs after the stop codon, A>T] of SEQ ID No. 2, [CYP3A4, 945 base
pairs after the stop codon, A>T] of SEQ ID No. 3, [CYP3A4_5' region -747C>G]
of SEQ ID No. 4, [CYP3A4_IVS7 -202C>T] of SEQ ID No. 5, [CYP3A4, 2204
base pairs after the stop codon, G>C] of SEQ ID No. 6, [CYP3A4_IVS2
-132C>T] of SEQ ID No. 7, [CYP3A4_IVS1 -868C>T] of SEQ ID No. 8,
[CYP3A4_5' region -847A>T] of SEQ ID No. 9, [CYP3A4, 766 base pairs after
the stop codon, delT] of SEQ ID No. 10, [CYP3A4, 1454 base pairs after the stop
codon, C>T] of SEQ ID No. 11, [CYP3A4_IVS3 +1992T>C] of SEQ ID No. 12,
[CYP3A4_IVS9 +841T>G] of SEQ ID No. 13, [CYP3A4_IVS12 -473T>G] of SEQ
ID No. 14, [CYP3A4_IVS12 +581C>T] of SEQ ID No. 15, [CYP3A4_IVS12
+586G>A] of SEQ ID No. 16, [CYP3A4_IVS12 +646C>A] of SEQ ID No. 17,
[CYP3A4_IVS3 -734G>A] of SEQ ID No. 18, [CYP17_IVS1 -271A>C] of SEQ ID
No. 19, [CYP17_IVS5 +75C>G] of SEQ ID No. 20, [CYP17_IVS1 +426G>A] of
SEQ ID No. 21, [CYP17_IVS1 -99C>T] of SEQ ID No. 22, [CYP17_IVS1
-700C>G] of SEQ ID No. 23, [CYP17_IVS1 -565G>A] of SEQ ID No. 24,
[CYP17_IVS3 +141A>T] of SEQ ID No. 25, [CYP17_5' region -1488C>G] of SEQ
ID No. 26, [CYP17_5' region -1204C>T] of SEQ ID No. 27, [CYP17_IVS1
+466G>A] of SEQ ID No. 28, [CYP17, 712 base pairs after the stop codon, G>A]
of SEQ ID No. 29, [SRD5A2, 1356 base pairs after the stop codon (3' UTR), A>C]
of SEQ ID No. 30, [SRD5A2, 849 base pairs after the stop codon (3' UTR), A>G]
of SEQ ID No. 31, [SRD5A2_5' region -870G>A] of SEQ ID No. 32, [SRD5A2_5'
region -2036(A)7-8] of SEQ ID No. 33, [SRD5A2, 545 base pairs after the stop
codon (3' UTR), T>C] of SEQ ID No. 34, [SRD5A2_IVS2 +626C>T] of SEQ ID No.
35, [SRD5A2_5' region -8029C>T] of SEQ ID No. 36, [CYP3A4_IVS7 +34T>G]
of SEQ ID No. 42, [CYP3A4_5' region -1232C>T] of SEQ ID No. 43, [SRD5A2_5'
region -3001G>A] of SEQ ID No. 44, and [SRD5A2, 1552 base pairs after the
stop codon, G>A] of SEQ ID No. 45.

Optionally, at least one single nucleotide polymorphism is the complement 
of any of the single nucleotide polymorphisms as hereinbefore described. 
Suitably, the oligonucleotide further comprises a detectable label. 

Preferably, the label is selected from the group consisting of fluorophore,
radionuclide, peptide, enzyme, antibody or antigen. More preferably, the
fluorophore is a fluorescent compound selected from the group consisting of
Hoechst 33342, Cy2, Cy3, Cy5, CypHer, coumarin, FITC, DAPI, Alexa 633,
DRAQ5 and Alexa 488.

In a seventh aspect of the present invention, there is provided a method of
treatment or prophylaxis of a subject comprising the steps of

i) analysing a biological sample containing nucleic acid obtained from
the subject to detect the presence or absence of at least one single
nucleotide polymorphism in SEQ ID NOS 1-36 or SEQ ID NOS 42-45,
or the complement thereof, associated with a disease, condition
or disorder related to prostate or breast cancer; and

ii) treating the subject for the disease, condition or disorder if step i)
detects the presence of at least one single nucleotide polymorphism
in SEQ ID NOS 1-36 or SEQ ID NOS 42-45, or the complement
thereof.

Treatment may take a variety of forms depending upon the nature of the
cancer. Hormonal therapy is a widely used treatment for patients with metastatic
carcinoma of the prostate (Goethuys et al. (1997) Am J Clin Oncol. 20, 40-45).
Such treatment may, for example, involve androgen deprivation by surgical means
(e.g. orchiectomy) or by androgen suppressive agents such as estrogens
(e.g. diethylstilbestrol), antiandrogens (e.g. flutamide) and luteinising
hormone-releasing hormone agonists (e.g. leuprolide). Radiotherapy using
radionuclides, such as phosphorus-32 or strontium-89, can be an effective
treatment for the disease. There is also growing interest in the development of
vaccines (Slovin (2001) Hematol. Oncol. Clinic N. Am, 15, 477-496) or the use of
gene therapeutic methods (Ferrer & Rodriguez (2001) Hematol Oncol Clinic of N. Am
15, 497-508) for the treatment of prostate cancer.

Suitably, the nucleic acid is selected from the group consisting of DNA,
RNA and mRNA.

Preferably, the sample is analysed to detect the presence or absence of at
least one single nucleotide polymorphism at a position selected from the group
consisting of position [CYP3A4_IVS9 +187] of SEQ ID No. 1, position [CYP3A4,
1639 base pairs after the stop codon] of SEQ ID No. 2, position [CYP3A4, 945
base pairs after the stop codon] of SEQ ID No. 3, position [CYP3A4_5' region
-747] of SEQ ID No. 4, position [CYP3A4_IVS7 -202] of SEQ ID No. 5, position
[CYP3A4, 2204 base pairs after the stop codon] of SEQ ID No. 6, position
[CYP3A4_IVS2 -132] of SEQ ID No. 7, position [CYP3A4_IVS1 -868] of SEQ ID
No. 8, position [CYP3A4_5' region -847] of SEQ ID No. 9, position [CYP3A4, 766
base pairs after the stop codon] of SEQ ID No. 10, position [CYP3A4, 1454 base
pairs after the stop codon] of SEQ ID No. 11, position [CYP3A4_IVS3 +1992] of
SEQ ID No. 12, position [CYP3A4_IVS9 +841] of SEQ ID No. 13, position
[CYP3A4_IVS12 -473] of SEQ ID No. 14, position [CYP3A4_IVS12 +581] of SEQ
ID No. 15, position [CYP3A4_IVS12 +586] of SEQ ID No. 16, position
[CYP3A4_IVS12 +646] of SEQ ID No. 17, position [CYP3A4_IVS3 -734] of SEQ
ID No. 18, position [CYP17_IVS1 -271] of SEQ ID No. 19, position [CYP17_IVS5
+75] of SEQ ID No. 20, position [CYP17_IVS1 +426] of SEQ ID No. 21, position
[CYP17_IVS1 -99] of SEQ ID No. 22, position [CYP17_IVS1 -700] of SEQ ID No.
23, position [CYP17_IVS1 -565] of SEQ ID No. 24, position [CYP17_IVS3 +141]
of SEQ ID No. 25, position [CYP17_5' region -1488] of SEQ ID No. 26, position
[CYP17_5' region -1204] of SEQ ID No. 27, position [CYP17_IVS1 +466] of SEQ
ID No. 28, position [CYP17, 712 base pairs after the stop codon] of SEQ ID No.
29, position [SRD5A2, 1356 base pairs after the stop codon (3' UTR)] of SEQ ID
No. 30, position [SRD5A2, 849 base pairs after the stop codon (3' UTR)] of SEQ
ID No. 31, position [SRD5A2_5' region -870] of SEQ ID No. 32, position
[SRD5A2_5' region between -2036 and -2030] of SEQ ID No. 33, position
[SRD5A2, 545 base pairs after the stop codon (3' UTR)] of SEQ ID No. 34,
position [SRD5A2_IVS2 +626] of SEQ ID No. 35, position [SRD5A2_5' region
-8029] of SEQ ID No. 36, position [CYP3A4_IVS7 +34] of SEQ ID No. 42, position
[CYP3A4_5' region -1232] of SEQ ID No. 43, position [SRD5A2_5' region -3001]
of SEQ ID No. 44, and position [SRD5A2, 1552 base pairs after the stop codon]
of SEQ ID No. 45.

More preferably, at least one single nucleotide polymorphism is selected
from the group consisting of
[CYP3A4_IVS9 +187C>G] of SEQ ID No. 1, [CYP3A4, 1639 base pairs after the stop
codon, A>T] of SEQ ID No. 2, [CYP3A4, 945 base pairs after the stop codon, A>T]
of SEQ ID No. 3, [CYP3A4_5' region -747C>G] of SEQ ID No. 4, [CYP3A4_IVS7
-202C>T] of SEQ ID No. 5, [CYP3A4, 2204 base pairs after the stop codon, G>C]
of SEQ ID No. 6, [CYP3A4_IVS2 -132C>T] of SEQ ID No. 7, [CYP3A4_IVS1
-868C>T] of SEQ ID No. 8, [CYP3A4_5' region -847A>T] of SEQ ID No. 9,
[CYP3A4, 766 base pairs after the stop codon, delT] of SEQ ID No. 10, [CYP3A4,
1454 base pairs after the stop codon, C>T] of SEQ ID No. 11, [CYP3A4_IVS3
+1992T>C] of SEQ ID No. 12, [CYP3A4_IVS9 +841T>G] of SEQ ID No. 13,
[CYP3A4_IVS12 -473T>G] of SEQ ID No. 14, [CYP3A4_IVS12 +581C>T] of SEQ
ID No. 15, [CYP3A4_IVS12 +586G>A] of SEQ ID No. 16, [CYP3A4_IVS12
+646C>A] of SEQ ID No. 17, [CYP3A4_IVS3 -734G>A] of SEQ ID No. 18,
[CYP17_IVS1 -271A>C] of SEQ ID No. 19, [CYP17_IVS5 +75C>G] of SEQ ID
No. 20, [CYP17_IVS1 +426G>A] of SEQ ID No. 21, [CYP17_IVS1 -99C>T] of
SEQ ID No. 22, [CYP17_IVS1 -700C>G] of SEQ ID No. 23, [CYP17_IVS1
-565G>A] of SEQ ID No. 24, [CYP17_IVS3 +141A>T] of SEQ ID No. 25,
[CYP17_5' region -1488C>G] of SEQ ID No. 26, [CYP17_5' region -1204C>T] of
SEQ ID No. 27, [CYP17_IVS1 +466G>A] of SEQ ID No. 28, [CYP17, 712 base
pairs after the stop codon, G>A] of SEQ ID No. 29, [SRD5A2, 1356 base pairs
after the stop codon (3' UTR), A>C] of SEQ ID No. 30, [SRD5A2, 849 base pairs
after the stop codon (3' UTR), A>G] of SEQ ID No. 31, [SRD5A2_5' region
-870G>A] of SEQ ID No. 32, [SRD5A2_5' region -2036(A)7-8] of SEQ ID No. 33,
[SRD5A2, 545 base pairs after the stop codon (3' UTR), T>C] of SEQ ID No. 34,
[SRD5A2_IVS2+626C>T] of SEQ ID No. 35, [SRD5A2_5' region -8029C>T] of
SEQ ID No. 36, [CYP3A4_IVS7+34T>G] of SEQ ID No. 42, [CYP3A4_5' region
-1232C>T] of SEQ ID No. 43, [SRD5A2_5' region -3001G>A] of SEQ ID No. 44,
and [SRD5A2, 1552 base pairs after the stop codon, G>A] of SEQ ID No. 45.

Optionally, at least one single nucleotide polymorphism is the complement 
of any of the single nucleotide polymorphisms hereinbefore described. 

Suitably, the method counteracts the effect of at least one single 
nucleotide polymorphism detected. 

In a first embodiment of the seventh aspect, the method comprises
treatment with a polynucleotide selected from the group consisting of polymorphic
sequences SEQ ID NOS 1-36 or SEQ ID NOS 42-45, or their complement,
provided that the polymorphic sequence, or the complement, does not contain at
least one single nucleotide polymorphism at a position selected from the group
consisting of position [CYP3A4_IVS9 +187] of SEQ ID No. 1, position [CYP3A4,
1639 base pairs after the stop codon] of SEQ ID No. 2, position [CYP3A4, 945
base pairs after the stop codon] of SEQ ID No. 3, position [CYP3A4_5' region
-747] of SEQ ID No. 4, position [CYP3A4_IVS7 -202] of SEQ ID No. 5, position
[CYP3A4, 2204 base pairs after the stop codon] of SEQ ID No. 6, position
[CYP3A4_IVS2 -132] of SEQ ID No. 7, position [CYP3A4_IVS1 -868] of SEQ ID
No. 8, position [CYP3A4_5' region -847] of SEQ ID No. 9, position [CYP3A4, 766
base pairs after the stop codon] of SEQ ID No. 10, position [CYP3A4, 1454 base
pairs after the stop codon] of SEQ ID No. 11, position [CYP3A4_IVS3 +1992] of
SEQ ID No. 12, position [CYP3A4_IVS9 +841] of SEQ ID No. 13, position
[CYP3A4_IVS12 -473] of SEQ ID No. 14, position [CYP3A4_IVS12 +581] of SEQ
ID No. 15, position [CYP3A4_IVS12 +586] of SEQ ID No. 16, position
[CYP3A4_IVS12 +646] of SEQ ID No. 17, position [CYP3A4_IVS3 -734] of SEQ
ID No. 18, position [CYP17_IVS1 -271] of SEQ ID No. 19, position [CYP17_IVS5
+75] of SEQ ID No. 20, position [CYP17_IVS1 +426] of SEQ ID No. 21, position
[CYP17_IVS1 -99] of SEQ ID No. 22, position [CYP17_IVS1 -700] of SEQ ID No.
23, position [CYP17_IVS1 -565] of SEQ ID No. 24, position [CYP17_IVS3 +141]
of SEQ ID No. 25, position [CYP17_5' region -1488] of SEQ ID No. 26, position
[CYP17_5' region -1204] of SEQ ID No. 27, position [CYP17_IVS1 +466] of SEQ
ID No. 28, position [CYP17, 712 base pairs after the stop codon] of SEQ ID No.
29, position [SRD5A2, 1356 base pairs after the stop codon (3' UTR)] of SEQ ID
No. 30, position [SRD5A2, 849 base pairs after the stop codon (3' UTR)] of SEQ
ID No. 31, position [SRD5A2_5' region -870] of SEQ ID No. 32, position
[SRD5A2_5' region between -2036 and -2030] of SEQ ID No. 33, position
[SRD5A2, 545 base pairs after the stop codon (3' UTR)] of SEQ ID No. 34,
position [SRD5A2_IVS2+626] of SEQ ID No. 35, position [SRD5A2_5' region
-8029] of SEQ ID No. 36, position [CYP3A4_IVS7+34] of SEQ ID No. 42, position
[CYP3A4_5' region -1232] of SEQ ID No. 43, position [SRD5A2_5' region -3001]
of SEQ ID No. 44, and position [SRD5A2, 1552 base pairs after the stop codon]
of SEQ ID No. 45.

Preferably, the polymorphic sequence does not contain at least one single
nucleotide polymorphism selected from the group consisting of
[CYP3A4_IVS9 +187C>G] of SEQ ID No. 1, [CYP3A4, 1639 base pairs after the stop
codon, A>T] of SEQ ID No. 2, [CYP3A4, 945 base pairs after the stop codon, A>T]
of SEQ ID No. 3, [CYP3A4_5' region -747C>G] of SEQ ID No. 4, [CYP3A4_IVS7
-202C>T] of SEQ ID No. 5, [CYP3A4, 2204 base pairs after the stop codon, G>C]
of SEQ ID No. 6, [CYP3A4_IVS2 -132C>T] of SEQ ID No. 7, [CYP3A4_IVS1
-868C>T] of SEQ ID No. 8, [CYP3A4_5' region -847A>T] of SEQ ID No. 9,
[CYP3A4, 766 base pairs after the stop codon, delT] of SEQ ID No. 10, [CYP3A4,
1454 base pairs after the stop codon, C>T] of SEQ ID No. 11, [CYP3A4_IVS3
+1992T>C] of SEQ ID No. 12, [CYP3A4_IVS9 +841T>G] of SEQ ID No. 13,
[CYP3A4_IVS12 -473T>G] of SEQ ID No. 14, [CYP3A4_IVS12 +581C>T] of SEQ
ID No. 15, [CYP3A4_IVS12 +586G>A] of SEQ ID No. 16, [CYP3A4_IVS12
+646C>A] of SEQ ID No. 17, [CYP3A4_IVS3 -734G>A] of SEQ ID No. 18,
[CYP17_IVS1 -271A>C] of SEQ ID No. 19, [CYP17_IVS5 +75C>G] of SEQ ID
No. 20, [CYP17_IVS1 +426G>A] of SEQ ID No. 21, [CYP17_IVS1 -99C>T] of
SEQ ID No. 22, [CYP17_IVS1 -700C>G] of SEQ ID No. 23, [CYP17_IVS1
-565G>A] of SEQ ID No. 24, [CYP17_IVS3 +141A>T] of SEQ ID No. 25,
[CYP17_5' region -1488C>G] of SEQ ID No. 26, [CYP17_5' region -1204C>T] of
SEQ ID No. 27, [CYP17_IVS1 +466G>A] of SEQ ID No. 28, [CYP17, 712 base
pairs after the stop codon, G>A] of SEQ ID No. 29, [SRD5A2, 1356 base pairs
after the stop codon (3' UTR), A>C] of SEQ ID No. 30, [SRD5A2, 849 base pairs
after the stop codon (3' UTR), A>G] of SEQ ID No. 31, [SRD5A2_5' region
-870G>A] of SEQ ID No. 32, [SRD5A2_5' region -2036(A)7-8] of SEQ ID No. 33,
[SRD5A2, 545 base pairs after the stop codon (3' UTR), T>C] of SEQ ID No. 34,
[SRD5A2_IVS2+626C>T] of SEQ ID No. 35, [SRD5A2_5' region -8029C>T] of
SEQ ID No. 36, [CYP3A4_IVS7+34T>G] of SEQ ID No. 42, [CYP3A4_5' region
-1232C>T] of SEQ ID No. 43, [SRD5A2_5' region -3001G>A] of SEQ ID No. 44,
and [SRD5A2, 1552 base pairs after the stop codon, G>A] of SEQ ID No. 45.

Preferably, the polymorphic sequence does not contain at least one single
nucleotide polymorphism which is the complement of any of the single nucleotide
polymorphisms hereinbefore described.

In a second embodiment of the seventh aspect, the method comprises
treatment with a polypeptide which is encoded by a polynucleotide selected from
the group consisting of polymorphic sequences SEQ ID NOS 1-36 and SEQ ID
NOS 42-45 or their complement, provided that the polymorphic sequence, or the
complement, does not contain at least one single nucleotide polymorphism at a
position selected from the group
consisting of position [CYP3A4_IVS9 +187] of SEQ ID No. 1, position [CYP3A4,
1639 base pairs after the stop codon] of SEQ ID No. 2, position [CYP3A4, 945
base pairs after the stop codon] of SEQ ID No. 3, position [CYP3A4_5' region
-747] of SEQ ID No. 4, position [CYP3A4_IVS7 -202] of SEQ ID No. 5, position
[CYP3A4, 2204 base pairs after the stop codon] of SEQ ID No. 6, position
[CYP3A4_IVS2 -132] of SEQ ID No. 7, position [CYP3A4_IVS1 -868] of SEQ ID
No. 8, position [CYP3A4_5' region -847] of SEQ ID No. 9, position [CYP3A4, 766
base pairs after the stop codon] of SEQ ID No. 10, position [CYP3A4, 1454 base
pairs after the stop codon] of SEQ ID No. 11, position [CYP3A4_IVS3 +1992] of
SEQ ID No. 12, position [CYP3A4_IVS9 +841] of SEQ ID No. 13, position
[CYP3A4_IVS12 -473] of SEQ ID No. 14, position [CYP3A4_IVS12 +581] of SEQ
ID No. 15, position [CYP3A4_IVS12 +586] of SEQ ID No. 16, position
[CYP3A4_IVS12 +646] of SEQ ID No. 17, position [CYP3A4_IVS3 -734] of SEQ
ID No. 18, position [CYP17_IVS1 -271] of SEQ ID No. 19, position [CYP17_IVS5
+75] of SEQ ID No. 20, position [CYP17_IVS1 +426] of SEQ ID No. 21, position
[CYP17_IVS1 -99] of SEQ ID No. 22, position [CYP17_IVS1 -700] of SEQ ID No.
23, position [CYP17_IVS1 -565] of SEQ ID No. 24, position [CYP17_IVS3 +141]
of SEQ ID No. 25, position [CYP17_5' region -1488] of SEQ ID No. 26, position
[CYP17_5' region -1204] of SEQ ID No. 27, position [CYP17_IVS1 +466] of SEQ
ID No. 28, position [CYP17, 712 base pairs after the stop codon] of SEQ ID No.
29, position [SRD5A2, 1356 base pairs after the stop codon (3' UTR)] of SEQ ID
No. 30, position [SRD5A2, 849 base pairs after the stop codon (3' UTR)] of SEQ
ID No. 31, position [SRD5A2_5' region -870] of SEQ ID No. 32, position
[SRD5A2_5' region between -2036 and -2030] of SEQ ID No. 33, position
[SRD5A2, 545 base pairs after the stop codon (3' UTR)] of SEQ ID No. 34,
position [SRD5A2_IVS2+626] of SEQ ID No. 35, position [SRD5A2_5' region
-8029] of SEQ ID No. 36, position [CYP3A4_IVS7+34] of SEQ ID No. 42, position
[CYP3A4_5' region -1232] of SEQ ID No. 43, position [SRD5A2_5' region -3001]
of SEQ ID No. 44, and position [SRD5A2, 1552 base pairs after the stop codon]
of SEQ ID No. 45.

Preferably, the polymorphic sequence does not contain at least one single
nucleotide polymorphism selected from the group consisting of
[CYP3A4_IVS9 +187C>G] of SEQ ID No. 1, [CYP3A4, 1639 base pairs after the stop
codon, A>T] of SEQ ID No. 2, [CYP3A4, 945 base pairs after the stop codon, A>T]
of SEQ ID No. 3, [CYP3A4_5' region -747C>G] of SEQ ID No. 4, [CYP3A4_IVS7
-202C>T] of SEQ ID No. 5, [CYP3A4, 2204 base pairs after the stop codon, G>C]
of SEQ ID No. 6, [CYP3A4_IVS2 -132C>T] of SEQ ID No. 7, [CYP3A4_IVS1
-868C>T] of SEQ ID No. 8, [CYP3A4_5' region -847A>T] of SEQ ID No. 9,
[CYP3A4, 766 base pairs after the stop codon, delT] of SEQ ID No. 10, [CYP3A4,
1454 base pairs after the stop codon, C>T] of SEQ ID No. 11, [CYP3A4_IVS3
+1992T>C] of SEQ ID No. 12, [CYP3A4_IVS9 +841T>G] of SEQ ID No. 13,
[CYP3A4_IVS12 -473T>G] of SEQ ID No. 14, [CYP3A4_IVS12 +581C>T] of SEQ
ID No. 15, [CYP3A4_IVS12 +586G>A] of SEQ ID No. 16, [CYP3A4_IVS12
+646C>A] of SEQ ID No. 17, [CYP3A4_IVS3 -734G>A] of SEQ ID No. 18,
[CYP17_IVS1 -271A>C] of SEQ ID No. 19, [CYP17_IVS5 +75C>G] of SEQ ID
No. 20, [CYP17_IVS1 +426G>A] of SEQ ID No. 21, [CYP17_IVS1 -99C>T] of
SEQ ID No. 22, [CYP17_IVS1 -700C>G] of SEQ ID No. 23, [CYP17_IVS1
-565G>A] of SEQ ID No. 24, [CYP17_IVS3 +141A>T] of SEQ ID No. 25,
[CYP17_5' region -1488C>G] of SEQ ID No. 26, [CYP17_5' region -1204C>T] of
SEQ ID No. 27, [CYP17_IVS1 +466G>A] of SEQ ID No. 28, [CYP17, 712 base
pairs after the stop codon, G>A] of SEQ ID No. 29, [SRD5A2, 1356 base pairs
after the stop codon (3' UTR), A>C] of SEQ ID No. 30, [SRD5A2, 849 base pairs
after the stop codon (3' UTR), A>G] of SEQ ID No. 31, [SRD5A2_5' region
-870G>A] of SEQ ID No. 32, [SRD5A2_5' region -2036(A)7-8] of SEQ ID No. 33,
[SRD5A2, 545 base pairs after the stop codon (3' UTR), T>C] of SEQ ID No. 34,
[SRD5A2_IVS2+626C>T] of SEQ ID No. 35, [SRD5A2_5' region -8029C>T] of
SEQ ID No. 36, [CYP3A4_IVS7+34T>G] of SEQ ID No. 42, [CYP3A4_5' region
-1232C>T] of SEQ ID No. 43, [SRD5A2_5' region -3001G>A] of SEQ ID No. 44,
and [SRD5A2, 1552 base pairs after the stop codon, G>A] of SEQ ID No. 45.

Suitably, the polymorphic sequence does not contain at least one single
nucleotide polymorphism which is the complement of any of the single nucleotide
polymorphisms as hereinbefore described.

In a third embodiment of the seventh aspect, the method comprises
treatment with an antibody that binds specifically with a polypeptide encoded by a
polynucleotide selected from the group consisting of SEQ ID NOS 1-34, or SEQ
ID NOS 42-45, or the complement thereof.

According to an eighth aspect of the present invention, there is provided a
method for predicting the genetic ability of a subject or an organism to metabolise
a chemical, the method comprising analysing a biological sample containing
nucleic acid obtained from the subject or organism to detect the presence or
absence of one or more single nucleotide polymorphisms at a position selected
from the group
consisting of position [CYP3A4_IVS9 +187] of SEQ ID No. 1, position [CYP3A4,
1639 base pairs after the stop codon] of SEQ ID No. 2, position [CYP3A4, 945
base pairs after the stop codon] of SEQ ID No. 3, position [CYP3A4_5' region
-747] of SEQ ID No. 4, position [CYP3A4_IVS7 -202] of SEQ ID No. 5, position
[CYP3A4, 2204 base pairs after the stop codon] of SEQ ID No. 6, position
[CYP3A4_IVS2 -132] of SEQ ID No. 7, position [CYP3A4_IVS1 -868] of SEQ ID
No. 8, position [CYP3A4_5' region -847] of SEQ ID No. 9, position [CYP3A4, 766
base pairs after the stop codon] of SEQ ID No. 10, position [CYP3A4, 1454 base
pairs after the stop codon] of SEQ ID No. 11, position [CYP3A4_IVS3 +1992] of
SEQ ID No. 12, position [CYP3A4_IVS9 +841] of SEQ ID No. 13, position
[CYP3A4_IVS12 -473] of SEQ ID No. 14, position [CYP3A4_IVS12 +581] of SEQ
ID No. 15, position [CYP3A4_IVS12 +586] of SEQ ID No. 16, position
[CYP3A4_IVS12 +646] of SEQ ID No. 17, position [CYP3A4_IVS3 -734] of SEQ
ID No. 18, position [CYP17_IVS1 -271] of SEQ ID No. 19, position [CYP17_IVS5
+75] of SEQ ID No. 20, position [CYP17_IVS1 +426] of SEQ ID No. 21, position
[CYP17_IVS1 -99] of SEQ ID No. 22, position [CYP17_IVS1 -700] of SEQ ID No.
23, position [CYP17_IVS1 -565] of SEQ ID No. 24, position [CYP17_IVS3 +141]
of SEQ ID No. 25, position [CYP17_5' region -1488] of SEQ ID No. 26, position
[CYP17_5' region -1204] of SEQ ID No. 27, position [CYP17_IVS1 +466] of SEQ
ID No. 28, position [CYP17, 712 base pairs after the stop codon] of SEQ ID No.
29, position [SRD5A2, 1356 base pairs after the stop codon (3' UTR)] of SEQ ID
No. 30, position [SRD5A2, 849 base pairs after the stop codon (3' UTR)] of SEQ
ID No. 31, position [SRD5A2_5' region -870] of SEQ ID No. 32, position
[SRD5A2_5' region between -2036 and -2030] of SEQ ID No. 33, position
[SRD5A2, 545 base pairs after the stop codon (3' UTR)] of SEQ ID No. 34,
position [SRD5A2_IVS2+626] of SEQ ID No. 35, position [SRD5A2_5' region
-8029] of SEQ ID No. 36, position [CYP3A4_IVS7+34] of SEQ ID No. 42, position
[CYP3A4_5' region -1232] of SEQ ID No. 43, position [SRD5A2_5' region -3001]
of SEQ ID No. 44, and position [SRD5A2, 1552 base pairs after the stop codon]
of SEQ ID No. 45.

The presence of a polymorphism at one or more of the positions is
indicative of the subject's or organism's ability or inability to metabolise the
chemical.

Preferably, the analysis comprises detecting the presence or absence of
one or more single nucleotide polymorphisms selected from the group consisting of
[CYP3A4_IVS9 +187C>G] of SEQ ID No. 1, [CYP3A4, 1639 base pairs after the stop
codon, A>T] of SEQ ID No. 2, [CYP3A4, 945 base pairs after the stop codon, A>T]
of SEQ ID No. 3, [CYP3A4_5' region -747C>G] of SEQ ID No. 4, [CYP3A4_IVS7
-202C>T] of SEQ ID No. 5, [CYP3A4, 2204 base pairs after the stop codon, G>C]
of SEQ ID No. 6, [CYP3A4_IVS2 -132C>T] of SEQ ID No. 7, [CYP3A4_IVS1
-868C>T] of SEQ ID No. 8, [CYP3A4_5' region -847A>T] of SEQ ID No. 9,
[CYP3A4, 766 base pairs after the stop codon, delT] of SEQ ID No. 10, [CYP3A4,
1454 base pairs after the stop codon, C>T] of SEQ ID No. 11, [CYP3A4_IVS3
+1992T>C] of SEQ ID No. 12, [CYP3A4_IVS9 +841T>G] of SEQ ID No. 13,
[CYP3A4_IVS12 -473T>G] of SEQ ID No. 14, [CYP3A4_IVS12 +581C>T] of SEQ
ID No. 15, [CYP3A4_IVS12 +586G>A] of SEQ ID No. 16, [CYP3A4_IVS12
+646C>A] of SEQ ID No. 17, [CYP3A4_IVS3 -734G>A] of SEQ ID No. 18,
[CYP17_IVS1 -271A>C] of SEQ ID No. 19, [CYP17_IVS5 +75C>G] of SEQ ID
No. 20, [CYP17_IVS1 +426G>A] of SEQ ID No. 21, [CYP17_IVS1 -99C>T] of
SEQ ID No. 22, [CYP17_IVS1 -700C>G] of SEQ ID No. 23, [CYP17_IVS1
-565G>A] of SEQ ID No. 24, [CYP17_IVS3 +141A>T] of SEQ ID No. 25,
[CYP17_5' region -1488C>G] of SEQ ID No. 26, [CYP17_5' region -1204C>T] of
SEQ ID No. 27, [CYP17_IVS1 +466G>A] of SEQ ID No. 28, [CYP17, 712 base
pairs after the stop codon, G>A] of SEQ ID No. 29, [SRD5A2, 1356 base pairs
after the stop codon (3' UTR), A>C] of SEQ ID No. 30, [SRD5A2, 849 base pairs
after the stop codon (3' UTR), A>G] of SEQ ID No. 31, [SRD5A2_5' region
-870G>A] of SEQ ID No. 32, [SRD5A2_5' region -2036(A)7-8] of SEQ ID No. 33,
[SRD5A2, 545 base pairs after the stop codon (3' UTR), T>C] of SEQ ID No. 34,
[SRD5A2_IVS2+626C>T] of SEQ ID No. 35, [SRD5A2_5' region -8029C>T] of
SEQ ID No. 36, [CYP3A4_IVS7+34T>G] of SEQ ID No. 42, [CYP3A4_5' region
-1232C>T] of SEQ ID No. 43, [SRD5A2_5' region -3001G>A] of SEQ ID No. 44,
and [SRD5A2, 1552 base pairs after the stop codon, G>A] of SEQ ID No. 45.

Preferably, the method further comprises predicting the response of the 
subject or the organism to the chemical by their ability or inability to metabolise 
the chemical. 

Suitably, the chemical is a drug or a xenobiotic.

Suitably, the organism is selected from the group consisting of bacterium,
fungus, protozoa, alga, insect, nematode, amphibian, plant, fish and mammal.

In a ninth aspect of the present invention, there is provided a vector
comprising a polynucleotide selected from the group consisting of a nucleotide
sequence comprising one or more polymorphic sequences of SEQ ID NOS 1-34
or SEQ ID NOS 42-45.

In a tenth aspect of the present invention, there is provided a host cell 
transformed with the vector hereinbefore described. 

Preferably, the host cell is selected from the group consisting of
bacterium, fungus, protozoa, alga, insect, nematode, amphibian, plant, fish and
mammal. More preferably the mammalian cell is a human cell.

In an eleventh aspect of the present invention, there is provided a method 
of metabolising a chemical using the host cell as hereinbefore described. 

In a twelfth aspect of the present invention, there is provided a method for
making a host cell resistant to a chemical, the method comprising transforming a
cell with any of the polynucleotides or with any of the vectors as hereinbefore
described.

In a thirteenth aspect of the present invention, there is provided an isolated
haplotype selected from the group consisting of CYP3A4_Hap4 and
SRD5A2_Hap3.

Preferably, the isolated CYP3A4_Hap4 haplotype consists of Allele T at
[CYP3A4_5' region -1232C>T], Allele C at [CYP3A4_5' region -747C>G], Allele
G at [CYP3A4_5' region -392A>G], Allele G at [CYP3A4_IVS7+34T>G], Allele T
at [CYP3A4_IVS7-202C>T], Allele G at [CYP3A4_stop+766T>G], Allele C at
[CYP3A4_stop+1454C>T], Allele T at [CYP3A4_stop+1639A>T] and Allele C at
[CYP3A4_stop+2204G>C].

Preferably, the isolated SRD5A2_Hap3 haplotype consists of Allele C at
[SRD5A2_5' region -8029C>T], Allele G at [SRD5A2_5' region -3001G>A],
Allele G at [SRD5A2_145G>A], Allele G at [SRD5A2_265G>C], Allele T at
[SRD5A2_IVS2+626C>T], Allele G at [SRD5A2_stop+1552G>A], Allele G at
[SRD5A2_stop+3059G>A] and Allele G at [SRD5A2_stop+9301G>C].

In a fourteenth aspect of the present invention, there is provided a method
for diagnosing a genetic susceptibility for a disease, condition or disorder related
to prostate or breast cancer in a subject, the method comprising analysing a
biological sample obtained from the subject to detect the presence or absence of
a haplotype as hereinbefore described.

In a fifteenth aspect of the present invention, there is provided a method of
diagnosing a genetic susceptibility for a disease, condition or disorder related to
prostate or breast cancer in a subject, the method comprising adding an antibody
to a polypeptide present in a sample obtained from the subject, which polypeptide
is encoded by a haplotype as hereinbefore described, or the complement thereof,
and detecting specific binding of the antibody to the polypeptide.

In a sixteenth aspect of the present invention, there is provided a method 
of treatment or prophylaxis of a subject comprising the steps of 

i) analysing a sample of biological material containing a nucleic acid 
obtained from the subject to detect the presence or absence of at 
least one haplotype as hereinbefore described, or the complement 
thereof, associated with a disease, condition or disorder related to 
prostate or breast cancer; and 

ii) treating the subject for the disease, condition or disorder if step i) 
detects the presence of at least one haplotype, or the complement 
thereof. 

Preferably, the method comprises treatment with a portion of the isolated
CYP3A4_Hap4 haplotype as hereinbefore described wherein the portion of the
haplotype does not consist of at least one allele from the group consisting of
Allele T at [CYP3A4_5' region -1232C>T], Allele C at [CYP3A4_5' region
-747C>G], Allele G at [CYP3A4_5' region -392A>G], Allele G at
[CYP3A4_IVS7+34T>G], Allele T at [CYP3A4_IVS7-202C>T], Allele G at
[CYP3A4_stop+766T>G], Allele C at [CYP3A4_stop+1454C>T], Allele T at
[CYP3A4_stop+1639A>T] and Allele C at [CYP3A4_stop+2204G>C].

Optionally, the method comprises treatment with a portion of the
isolated SRD5A2_Hap3 haplotype as hereinbefore described wherein the portion
of the haplotype does not comprise at least one allele from the group
consisting of Allele C at [SRD5A2_5' region -8029C>T], Allele G at [SRD5A2_5'
region -3001G>A], Allele G at [SRD5A2_145G>A], Allele G at
[SRD5A2_265G>C], Allele T at [SRD5A2_IVS2+626C>T], Allele G at
[SRD5A2_stop+1552G>A], Allele G at [SRD5A2_stop+3059G>A] and Allele G at
[SRD5A2_stop+9301G>C].

Brief Description of the Figures 

Figure 1 illustrates the Testosterone Biosynthetic Pathway.

Figures 2A, 2B, and 2C show the location and allele frequencies of selected
SNPs in CYP17A1 (FIG. 2A), CYP3A4 (FIG. 2B), and SRD5A2 (FIG. 2C),
together with the major haplotypes. Solid black triangles refer to the locations of
novel SNPs while white triangles denote locations of known SNPs. All haplotypes
with frequency ≥3% in at least one of the four sub-groups (European
Americans (EA), African Americans (AA), cases, controls) are given, along with
their case and control frequencies. Composite haplotype refers to all the
remaining rare haplotypes pooled together.

Detailed Description of the Invention
Approach 

A two-phase study was undertaken of CYP17, CYP3A4, and SRD5A2, to
evaluate the relationship between their genotypes/haplotypes and prostate
cancer. Phase I of the study first searched for single nucleotide polymorphisms
(SNPs) in these genes by re-sequencing 24 individuals from the Coriell
Polymorphism Discovery Resource (Coriell Cell Repositories, Camden, NJ) and
approximately 100 men from prostate cancer case-control sibships, and by
leveraging public databases. Eighty-seven SNPs were discovered and genotyped
in 276 men from case-control sibships. Those SNPs exhibiting preliminary
case-control allele frequency differences, or distinguishing (i.e., 'tagging') common
haplotypes across the genes, were identified for further study (24 SNPs total). In
Phase II of the study, the 24 SNPs were genotyped in an additional 841 men
from case-control sibships. Finally, associations between genotypes/haplotypes
in CYP17, CYP3A4, and SRD5A2 and prostate cancer were evaluated in the total
case-control sample of 1,117 brothers.

Subjects

A family-based association study population of 1,117 men (637 cases, 480
controls) was recruited between January 1998 and January 2001 from the major
medical institutions in the greater Cleveland area and from the Henry Ford Health
System in Detroit. The study was approved by the collaborating institutions'
Review Boards, and informed consent was obtained from all participating men.
Characteristics of the study population have been described (Casey et al. (2002)
Nat Genet 32, 581-583).

Men diagnosed with histologically confirmed prostate cancer at age 73 or
younger were invited to join the study if they had a living unaffected brother who
was either older than the proband, or at most eight years younger than the age at
diagnosis of the proband. This age restriction was selected in an attempt to
increase the potential for genetic factors affecting disease, and to help make
certain that the controls were not unaffected due simply to being of a younger
age. To help confirm that the controls were not diseased, the prostate specific
antigen (PSA) levels in their blood were tested. Individuals in the study with PSA
levels above 4 ng/ml were retained as 'controls' unless a subsequent diagnosis of
prostate cancer was made, at which time they were reclassified as cases.
Keeping them in the study was important because automatically excluding men
with elevated PSA levels regardless of their ultimate prostate cancer status can
lead to biased estimates of association (Lubin & Hartge (1984) Am J Epidemiol
120, 791-793; Poole (1999) Am J Epidemiol 150, 547-551). Information on the
cases' Gleason score (a measure of prostate cancer cellular differentiation) and
tumor stage (TNM, tumor-node-metastasis stage) was determined from their
medical records. The study population comprised 90% Caucasians
(European Americans), with the remainder primarily African American (9%).
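
The recruitment and classification rules in the preceding paragraph can be restated as a short Python sketch. The function and parameter names are illustrative assumptions and are not taken from the study; the sketch also assumes that ages are whole years and that the eight-year window is measured from the proband's age at diagnosis, as the text states.

```python
# Illustrative restatement of the sibship recruitment rules described above.
# All names are hypothetical; ages are assumed to be whole years.

MAX_AGE_AT_DIAGNOSIS = 73   # probands diagnosed at age 73 or younger
MAX_YEARS_YOUNGER = 8       # brother at most eight years younger than that age


def proband_is_eligible(age_at_diagnosis: int) -> bool:
    """A case qualifies if diagnosed at age 73 or younger."""
    return age_at_diagnosis <= MAX_AGE_AT_DIAGNOSIS


def brother_is_eligible(brother_age: int, proband_age: int,
                        proband_age_at_diagnosis: int) -> bool:
    """An unaffected brother qualifies if he is older than the proband, or at
    most eight years younger than the proband's age at diagnosis."""
    older_than_proband = brother_age > proband_age
    within_window = brother_age >= proband_age_at_diagnosis - MAX_YEARS_YOUNGER
    return older_than_proband or within_window


def study_status(psa_ng_ml: float, diagnosed_with_cancer: bool) -> str:
    """Elevated PSA alone never removes a man from the control group; only a
    subsequent diagnosis reclassifies him as a case."""
    return "case" if diagnosed_with_cancer else "control"
```

For example, a 58-year-old brother of a proband diagnosed at 65 falls inside the eight-year window, whereas a 55-year-old brother does not.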

Polymorphism discovery 

Polymorphisms were discovered by sequencing individuals from prostate
cancer sibships (67 cases and 43 controls for CYP17 and CYP3A4, and 51 cases
and 41 controls for SRD5A2). Of the 110 individuals sequenced for CYP17 and
CYP3A4, 106 were Caucasian, 2 were Hispanic, and 2 were African-American.
Of the 92 individuals sequenced for SRD5A2, 84 were Caucasian and 8 were
African American. In addition, the 24 individuals from the Coriell Cell Repository
Polymorphism Discovery Resource (Collins et al. (1998) Genome Res 8,
1229-1231) were sequenced against the three genes.

PCR primers covering coding regions, splice sites, 5' and 3' regions, and
parts of introns of CYP3A4 (reference sequence No. 39), CYP17 (reference
sequence No. 40), and SRD5A2 (reference sequence No. 41), were designed
using the Primer3 program
(http://www.genome.wi.mit.edu/cgi-bin/primer/primer3.cgi). PCR products were
sequenced using energy transfer dye terminators on the Amersham Biosciences
MegaBACE1000 (Amersham Biosciences, Sunnyvale, California) using standard
protocols. Sequence analysis
was performed by assigning quality values (Phred; University of Washington,
Seattle, Washington), assembling contigs (Phrap; University of Washington),
automated identification of candidate heterozygote SNPs (PolyPhred, University
of Washington), automated identification of candidate homozygote SNPs (High
Quality Mismatch, Amersham Biosciences, Sunnyvale, California) and by
operator confirmation (Consed, University of Washington). All polymorphisms
were confirmed by Single Nucleotide Primer Extension (SNuPE) assay
(Amersham Biosciences, Sunnyvale, California).

In addition to the novel polymorphisms discovered in this study, several
public resources were searched for SNPs in CYP17, CYP3A4, and SRD5A2:
dbSNP (http://www.ncbi.nlm.nih.gov/SNP/), the
Utah Genome Center (UGC) (http://www.genome.utah.edu/genesnps/genes/),
the Human Cytochrome P450 Allele Nomenclature Committee (HCANC)
(http://www.imm.ki.se/CYPalleles/), the Human Gene Mutation Database
(HGMD) (http://archive.uwcm.ac.uk/uwcm/mg/hgmd0.html) and the Human Genic
Bi-Allelic SEquences (HGBASE) Release 8 (http://hgbase.interactiva.de/).
For the Androgen Receptor gene,
several publicly available SNPs from dbSNP, HGBASE and the Androgen
Receptor Mutation Database (ARMD) (http://ww2.mcgill.ca/androgendb/) were
included.



Genotyping 

In Phase I, 276 individuals from prostate cancer sibships were genotyped
for 29 SNPs (11 novel, 18 known) in CYP17, 33 SNPs (18 novel, 15 known) in
CYP3A4, and 25 SNPs (5 novel, 20 known) in SRD5A2. The individuals included
153 cases and 123 brother controls, 70% European Americans and 30% African
Americans. The information from the 276 men was then used to determine initial
case-control frequency differences and haplotype tagging. The results were then
used to determine which SNPs should be genotyped in the remainder of the
study population (i.e. in Phase II of the study).
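
As a rough illustration of the Phase I screening step, the following Python sketch computes the difference in allele frequency between cases and controls for one SNP. It is not the authors' analysis: the function names and the two-letter genotype strings are assumptions made for the example, and the calculation deliberately ignores the relatedness of brothers within a sibship, which a family-based analysis would have to account for.

```python
from collections import Counter

# Minimal sketch: case-control allele frequency difference for one biallelic SNP.
# Genotypes are assumed to be two-letter strings such as "CT" (hypothetical format).


def allele_frequency(genotypes, allele):
    """Fraction of all observed chromosomes that carry the given allele."""
    counts = Counter("".join(genotypes))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return counts[allele] / total


def frequency_difference(case_genotypes, control_genotypes, allele):
    """Case allele frequency minus control allele frequency."""
    return (allele_frequency(case_genotypes, allele)
            - allele_frequency(control_genotypes, allele))


# Toy data only; these are not study genotypes.
cases = ["CT", "TT", "CC", "CT"]
controls = ["CC", "CC", "CT", "CC"]
difference = frequency_difference(cases, controls, "T")
```

A SNP with a large absolute difference in such a screen would be a candidate for genotyping in the remaining sibships; the threshold used to decide this is not stated in the text.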

In Phase II, a total of 24 SNPs were genotyped in 841 individuals, giving
information on a total of 1,117 individuals for Phase II.

Genotyping was performed utilizing the Single Nucleotide Primer
Extension (SNuPE) assay on the MegaBACE1000 capillary electrophoresis
platform (Amersham Biosciences, Sunnyvale, California).
The Primer3 program (http://www.genome.wi.mit.edu/cgi-bin/primer/primer3.cgi)
was used to design PCR primers to amplify regions containing the SNPs of
interest. PCR fragments were purified with 0.5 U of Shrimp Alkaline Phosphatase
(Amersham Biosciences) and 10 U of Exonuclease I (Amersham Biosciences) by
incubating at 37°C for 40 min and at 85°C for 15 min. The single base extension
(SBE) reaction was set up with 1 pmol of HPLC-purified SBE primer, 2-4 µl of
SNuPE Premix (Amersham Biosciences), 2-4 µl of sterile water, and 1 µl of
purified PCR fragment, and incubated for 25 cycles of 96°C for 10 sec, 50°C for 5
sec, and 60°C for 10 sec. For Phase I of the study, SNuPE reactions were set up in
96-well plates at 10 µl volume and purified with AutoSeq™96 Plates (Amersham
Biosciences) prior to injecting into the MegaBACE1000 system. For Phase II of
the study, SNuPE reactions were set up in 384-well plates at 5-6 µl volume, diluted
with 3-4 µl of sterile water and purified with 1 U of Shrimp Alkaline Phosphatase
(Amersham Biosciences) by incubating at 37°C for 45 min and at 85°C for 15 min
prior to injecting into the MegaBACE4000 system. In cases where low signal was
anticipated (due to faint PCR), SNuPE reactions were desalted using a custom
384-well filter plate incorporating modified size-exclusion technology (Millipore
Corporation, Billerica, MA). The Scierra Genotyping LWS™ (Amersham
Biosciences) system was utilized for the tracking and management of samples
and laboratory activity for Phase II of the study.

Specific software (SNPriDe) was developed for the automated design of
SNuPE primers. Using a purified PCR fragment containing the SNP of interest as
a template, a third, internal primer was designed so that its 3' end anneals
adjacent to the polymorphic base pair; during the SNuPE reaction a
fluorescently labeled dideoxynucleotide (terminator) is added onto the primer.
A separate software package (SNP Profiler™, Amersham Biosciences) was
developed that automatically processes the signal data and outputs the
maximum likelihood SNP genotypes. The system includes a user interface for
editing and verification.
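As a rough illustration of the primer-placement rule described above (not the SNPriDe software itself, whose internals are not described here), the following sketch takes a flanking sequence written in the allele notation used in the sequence listing, such as LEFT(T/G)RIGHT, and returns the bases immediately 5' of the polymorphic site. The function name, the default primer length of 20 nt, and the absence of any melting-temperature or secondary-structure checks are assumptions made for brevity.

```python
import re

def design_sbe_primer(flank: str, length: int = 20):
    """Return (primer, alleles) for a sequence like 'ACGT(T/G)ACGT'.

    The primer is the `length` bases immediately 5' of the SNP, so its
    3' end sits adjacent to the polymorphic base (hypothetical helper).
    """
    match = re.fullmatch(r"([ACGT]+)\(([ACGT])/([ACGT])\)([ACGT]*)", flank.upper())
    if match is None:
        raise ValueError("expected a sequence of the form LEFT(A1/A2)RIGHT")
    left, allele1, allele2, _right = match.groups()
    if len(left) < length:
        raise ValueError("5' flank is shorter than the requested primer length")
    # Take the last `length` bases of the 5' flank as the extension primer.
    return left[-length:], (allele1, allele2)

# Flanking sequence taken from one of the SNP entries in the sequence listing.
primer, alleles = design_sbe_primer("TTTTGTCTGTATGTCTATGTATCTA(T/G)CTATGTATCT")
```

A real design tool would also consider the reverse strand and reject primers prone to self-priming, a problem noted later in the genotyping results.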

Three SNPs, SRD5A2_SNP20 (V89L), SRD5A2_SNP22 (A49T) and
CYP17_SNP29 (-34>C), were analysed by restriction enzyme digestion (Cicek et
al., unpublished data).

Proofreading genotype data 

A large number of haplotypes inferred during initial rounds of haplotyping
implied erroneous genotype data. A phylogenetic study of inferred haplotypes
was performed to reveal the relationships between different haplotypes. All
haplotypes differing from another haplotype by only one SNP, and being
represented by only one individual, were subject to inspection. Genotype data for
the individual in question were reanalysed with SNP Profiler™ (Amersham
Biosciences) to exclude the possibility of an incorrect genotype. Rounds of
phylogenetic study of haplotypes, followed by reanalysis of suspicious genotypes
and inference of new haplotypes, were applied until no more incorrect genotypes
could be found. Three to six rounds were applied for each of the genes.
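The inspection rule above (a haplotype that differs from another by a single SNP and is carried by a single individual) can be sketched as a simple filter. The data layout, a mapping from haplotype string to the set of individuals carrying it, and all identifiers below are hypothetical; the source does not describe how the phylogenetic study was implemented.

```python
def hamming(a: str, b: str) -> int:
    """Number of SNP positions at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def suspicious_haplotypes(carriers):
    """carriers maps haplotype string -> set of individual IDs carrying it.

    Flags haplotypes seen in exactly one individual that lie one SNP away
    from some other inferred haplotype.
    """
    flagged = []
    for hap, people in carriers.items():
        if len(people) != 1:
            continue
        if any(other != hap and len(other) == len(hap) and hamming(hap, other) == 1
               for other in carriers):
            flagged.append(hap)
    return sorted(flagged)

carriers = {
    "ACGT": {"id01", "id02", "id03"},
    "ACGA": {"id04"},          # one SNP from ACGT, single carrier: flagged
    "TGCA": {"id05"},          # single carrier, but far from every other haplotype
    "ACTT": {"id06", "id07"},  # one SNP from ACGT, but seen in two individuals
}
to_inspect = suspicious_haplotypes(carriers)
```

Flagged haplotypes are only candidates for re-inspection of the underlying genotype calls; a genuine rare haplotype would also be flagged.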

Haplotyping 

Alleles within each of the three candidate genes were in strong linkage
disequilibrium with one another. Thus, for each gene, haplotypes were estimated
from the resulting genotypes, by disease status and within major ethnic groups,
using the software PHASE. This program uses Markov chain Monte Carlo to
estimate haplotypes, imputes information for missing genotypes, and
incorporates a statistical model for the distribution of unresolved haplotypes
based on coalescent theory (Stephens et al. (2001) Am J Hum Genet 68,
978-989).
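To illustrate why statistical haplotype estimation is needed at all, the sketch below enumerates every haplotype pair consistent with one unphased multi-SNP genotype. This is not the PHASE algorithm; it only shows the ambiguity that PHASE resolves probabilistically. The genotype representation (one allele pair per SNP) and the function name are assumptions.

```python
from itertools import product

def compatible_haplotype_pairs(genotype):
    """genotype: list of (allele, allele) tuples, one per SNP, unphased.

    Returns all distinct unordered haplotype pairs that could produce it.
    With k heterozygous sites there are 2**(k-1) such pairs (one if k == 0).
    """
    het = [i for i, (a, b) in enumerate(genotype) if a != b]
    pairs = set()
    # The first heterozygous site keeps a fixed orientation so that mirror
    # images of the same pair are not generated twice.
    for flips in product([False, True], repeat=max(len(het) - 1, 0)):
        orient = dict(zip(het[1:], flips))
        h1, h2 = [], []
        for i, (a, b) in enumerate(genotype):
            if orient.get(i, False):
                a, b = b, a
            h1.append(a)
            h2.append(b)
        pairs.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(pairs)

# Heterozygous at SNPs 1 and 3, homozygous at SNP 2: two possible phasings.
pairs = compatible_haplotype_pairs([("A", "G"), ("C", "C"), ("T", "G")])
```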

Haplotypes and haplotype tagging SNPs were first determined among the
276 men genotyped for Phase I of the study, where tagging SNPs were necessary
to define the most common haplotypes (e.g., >5%). After completing genotyping
on the entire study population (Phase II of the study), the resulting data were
used to estimate haplotypes.

Association analysis 

Case versus control allele frequencies were first compared within major
ethnic groups. Then the association between the resulting genotypes/haplotypes
and prostate cancer risk was evaluated by calculating odds ratios (OR, estimates
of relative risk) and 95% confidence intervals from conditional logistic regression
with family as the matching variable, using a robust variance estimator that
incorporates familial correlations. This is a standard approach for analyzing
sibling-matched case-control data, although sibling sets without any controls do
not contribute any information (197 cases total here) (Breslow and Day (1980)
IARC Sci Publ 32, 335-338). In the analyses of CYP17, CYP3A4, and SRD5A2,
a log-additive coding was used, which treats the most common polymorphism (or
haplotype) as the null-risk referent group and assumes that the relative risk of
carrying one polymorphism (or haplotype) is the square root of the risk of carrying
two. Since haplotypes were estimated for these three genes, the probabilities of
observed haplotypes were used in the analyses (Schaid et al. (2002) Am J Hum
Genet 70, 425-434).
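The log-additive coding and the use of haplotype probabilities can be sketched as follows. This is a minimal illustration under stated assumptions, not the S+ analysis itself: the function names, the dictionary layout for an individual's haplotype-pair probabilities, and the example probabilities are all hypothetical.

```python
import math

def expected_dosage(pair_probs, haplotype):
    """pair_probs: {(hap_a, hap_b): probability} for one individual.

    Returns the expected number of copies (0 to 2) of `haplotype`,
    weighting each possible haplotype pair by its estimated probability.
    """
    total = sum(pair_probs.values())
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError("pair probabilities must sum to 1")
    return sum(p * ((a == haplotype) + (b == haplotype))
               for (a, b), p in pair_probs.items())

def log_additive_odds_ratio(or_per_copy, copies):
    """Under log-additive coding, log(OR) is linear in the number of copies."""
    return math.exp(copies * math.log(or_per_copy))

# Hypothetical individual whose phase is uncertain between two pairs.
probs = {("Hap1", "Hap4"): 0.7, ("Hap1", "Hap1"): 0.3}
dose = expected_dosage(probs, "Hap4")

# One copy carries the square root of the two-copy relative risk.
or_two = log_additive_odds_ratio(0.5, 2)
or_one = log_additive_odds_ratio(0.5, 1)
```

Here the referent (most common) haplotype contributes a dosage of zero, so its odds ratio is 1 by construction.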

To control for potential confounding, age was adjusted for in all regression
models. In addition to looking at the main effects of each SNP or haplotype, the
analyses were also stratified by the case's disease aggressiveness, where high
aggressiveness was defined by TNM stage ≥ T2B or Gleason score > 7; and low
aggressiveness by TNM stage < T2B and Gleason score < 7. All statistical
analyses were undertaken with the S+ software (version 6.0, Insightful Corp,
2001).

Polymorphism discovery (Phase I) 

A total of 34 novel SNPs were detected: 11 in CYP17, 18 in CYP3A4, and
5 in SRD5A2 (Table 2). In addition, 11 SNPs were "rediscovered" from the public
databases. Including these 11 SNPs, 53 SNPs were selected in total from the
databases: 18 in CYP17, 15 in CYP3A4, and 20 in SRD5A2. These were chosen
based on the intention to obtain an even distribution of SNPs across the genes
and on their availability in the databases at that time (January-April 2001).
Twenty-one SNPs were chosen from dbSNP, 27 from GeneSNPs, 12 from HGMD, 8 from
HGVbase, and 2 from HCANC (the total number of SNPs listed here exceeds 53
as several SNPs were present in multiple databases). Table 3 lists all 87 SNPs
(34 novel, 53 from databases), with their origins, exact locations and allele
frequencies.

Among the 34 novel SNPs, 26 (76%) were discovered in both the Coriell
and case-control populations. Three SNPs were only observed in the Coriell data,
and the remaining five were found only in the prostate cancer sibships. Of these
five, three were relatively rare (allele frequencies 0.2-1.5%), suggesting that they
may not have been discovered in the Coriell population simply due to its small
sample size (n=24). Nevertheless, the other two SNPs that were only found in the
prostate cancer sibships (CYP3A4_SNP12 and CYP17_SNP42) showed higher
allele frequencies (7.5% and 21.8%, respectively), suggesting that they might be
specific to the prostate cancer case-control population.
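The sample-size argument above can be made concrete with a short calculation. The sketch assumes that the 24 Coriell samples are unrelated diploid individuals (48 independently sampled chromosomes) and that the allele frequency in that panel equals the frequency seen in the sibships; both are simplifying assumptions, and the function name is hypothetical.

```python
def detection_probability(allele_freq: float, n_individuals: int) -> float:
    """Chance that at least one of 2*n sampled chromosomes carries the allele."""
    return 1.0 - (1.0 - allele_freq) ** (2 * n_individuals)

# Allele frequencies quoted in the text, with n=24 individuals.
rare_low = detection_probability(0.002, 24)    # rarest of the three rare SNPs
rare_high = detection_probability(0.015, 24)   # most common of the three rare SNPs
snp12 = detection_probability(0.075, 24)       # CYP3A4_SNP12
snp42 = detection_probability(0.218, 24)       # CYP17_SNP42
```

Under these assumptions the rare SNPs could easily be missed in 24 individuals, whereas an allele at 7.5% or 21.8% would very likely have been seen, which is consistent with the interpretation given in the text.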

Genotyping and Haplotyping

Phase I 

The 87 SNPs were genotyped in a total of 276 males from prostate
cancer sibships (29 in CYP17, 33 in CYP3A4, and 25 in SRD5A2). Eleven SNPs
gave ambiguous genotyping results. This might have been due to unoptimized
genotyping reactions or primer self-priming due to secondary structures and
non-specificity of PCR and/or SNuPE primers, especially within the Cytochrome
P450 gene family. Of the remaining 76 SNPs, a similar percentage of novel
(41%, or 12/29) and known (38%, or 18/47) SNPs had allele frequencies >10%.
However, 19/47 (40%) of the known SNPs were found to be monoallelic in the
276 men, suggesting that they are either extremely rare, population specific, or
artifacts.

In light of these results, the 11 SNPs with ambiguous genotype results, the
19 SNPs that appeared monoallelic in all samples tested, and an additional four
that were seen only in the Coriell Diversity Set but not in the prostate cancer
sibships were excluded. One further SNP was excluded because >15% of its data
were missing (due to a low success rate for the PCR and SNuPE reactions).
Finally, 12 SNPs were excluded because their minor allele frequencies were less
than 5% in all of the following four subgroups: European Americans, African
Americans, cases, and controls (Table 3). Following these exclusions, a total of
40 SNPs remained for consideration in the Phase II association study (14 in
CYP17, 16 in CYP3A4, and 10 in SRD5A2) (Table 3).
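The SNP bookkeeping in this section can be cross-checked with a few lines of arithmetic. All figures are taken from the text; only the variable names are introduced here.

```python
# SNPs entering Phase I: 34 novel plus 53 selected from public databases.
total_snps = 34 + 53

# Exclusions listed in the text, by reason.
excluded = {
    "ambiguous_genotypes": 11,
    "monoallelic": 19,
    "coriell_only": 4,
    "missing_data_over_15_percent": 1,
    "minor_allele_freq_below_5_percent": 12,
}
remaining = total_snps - sum(excluded.values())

# Per-gene counts reported for the SNPs that remained.
per_gene_remaining = {"CYP17": 14, "CYP3A4": 16, "SRD5A2": 10}

# Phase I and Phase II sample sizes.
total_individuals = 276 + 841
```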

Using the preliminary genotype information, haplotypes estimated with a
frequency >5% in at least one of the four major subgroups (i.e., European
American, African American, cases, or controls) were identified. Each gene had a
single "common" haplotype, with a frequency ranging between 42 and 51 percent
(not shown). Haplotype tagging SNPs were identified and used as a basis for
inclusion in Phase II of the study. In addition, non-tagging SNPs exhibiting
suggestive case versus control allele frequency differences were considered
(Table 3). Altogether, 24 SNPs were selected for Phase II.
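The text does not state which algorithm was used to pick haplotype tagging SNPs. One simple way to think about the task is as a search for the smallest set of SNP positions whose alleles still tell the common haplotypes apart. The exhaustive search below is a sketch of that idea only; the function name and the toy haplotypes are hypothetical.

```python
from itertools import combinations

def tagging_snps(haplotypes):
    """Return the smallest tuple of SNP indices whose alleles distinguish
    every distinct haplotype in `haplotypes` (equal-length strings).

    Exhaustive search: fine for a handful of SNPs per gene, exponential
    in general.
    """
    unique = set(haplotypes)
    n_sites = len(haplotypes[0])
    for size in range(n_sites + 1):
        for subset in combinations(range(n_sites), size):
            signatures = {tuple(h[i] for i in subset) for h in unique}
            if len(signatures) == len(unique):
                return subset
    return tuple(range(n_sites))

# Four toy haplotypes over four SNPs; SNP index 1 is monomorphic here.
tags = tagging_snps(["ACGT", "ACGA", "TCGT", "TCTA"])
```

In this toy example two SNPs (indices 0 and 3) are enough to identify all four haplotypes, so the other two need not be genotyped.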

Phase II 

The 24 tagging and suggestive SNPs were genotyped in an additional 841
men, giving information on a total of 1117 individuals for Phase II. Case versus
control allele frequency differences by ethnic group are presented in Table 3.
Haplotypes estimated with a frequency >3% in at least one of the four major
subgroups of the study population were identified. The major haplotypes for
CYP17, CYP3A4, and SRD5A2 along with their frequencies are presented in
Figure 2.



Association analyses 

In the association analyses, no associations between CYP17
genotypes/haplotypes and prostate cancer were detected. For CYP3A4, SNP1 was
found to be associated with an approximately 50% reduction
in risk (OR=0.53, 95% CI=0.29-0.99; p-value=0.05) (Table 4A). Furthermore, the
haplotype analysis revealed an association between CYP3A4_Hap4 and an
approximately 55% decrease in prostate cancer risk (OR=0.46, 95% CI=0.21-1.02;
p-value=0.05) (Table 5A). Two SNPs in SRD5A2 were also found to be associated
with an approximately 50% increase in prostate cancer risk: SRD5A2_SNP26
(OR=1.57, 95% CI=1.08-2.30; p-value=0.02), and SRD5A2_SNP20 (V89L)
(OR=1.56, 95% CI=1.08-2.25; p-value=0.02) (Table 4A). These SNPs, however,
were in almost complete linkage disequilibrium.
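The relationship between the reported odds ratios, confidence intervals and p-values can be sketched with a Wald-type approximation. This is only a plausibility check: it assumes the 95% interval is symmetric on the log-odds scale, whereas the reported values come from conditional logistic regression with a robust variance estimator and are rounded. The function names are hypothetical.

```python
import math

def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percent change relative to the referent."""
    return (odds_ratio - 1.0) * 100.0

def wald_z_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate the Wald z statistic from an OR and its 95% CI.

    Assumes log(upper) - log(lower) = 2 * z_crit * SE.
    """
    standard_error = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    return math.log(odds_ratio) / standard_error

def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# CYP3A4 SNP1 as reported in the text: OR=0.53, 95% CI=0.29-0.99.
change = percent_change_in_odds(0.53)  # about -47, i.e. roughly a 50% reduction
z = wald_z_from_ci(0.53, 0.29, 0.99)
p = two_sided_p(z)                     # close to the reported p-value of 0.05
```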

When the study population was stratified by high and low aggressiveness
of prostate cancer, several interesting associations emerged (see Tables 4B and
5B). First, five SNPs in CYP3A4 showed statistically significant associations with
low aggressiveness: CYP3A4_SNP11 (CYP3A4*1B) (OR=0.20, 95% CI=0.06-0.67;
p-value=0.009), CYP3A4_SNP47 (OR=0.19, 95% CI=0.06-0.62;
p-value=0.006), CYP3A4_SNP1 (OR=0.21, 95% CI=0.05-0.86; p-value=0.03),
CYP3A4_SNP25 (OR=6.54, 95% CI=0.99-43.10; p-value=0.05) and
CYP3A4_SNP15 (OR=0.41, 95% CI=0.22-0.79; p-value=0.007). Second, an
association was observed between CYP3A4_Hap4 and low aggressiveness
(OR=0.06, 95% CI=0.008-0.50; p-value=0.009) (Table 5B). Finally, an inverse
association was observed between SRD5A2_Hap3 and high aggressiveness
(OR=0.52, 95% CI=0.29-0.91; p-value=0.02) (Table 5B).

Table 6 provides annotation of CYP3A4, CYP17 and SRD5A2 genomic
sequences.

All of the SNPs disclosed in the present invention have utility in the
prognosis and diagnosis of prostate and breast cancer.

Although this invention has been described in terms of certain preferred
embodiments, other embodiments which will be apparent to those of ordinary skill
in the art in view of the disclosure herein are also within the scope of this
invention. Accordingly, the scope of the invention is intended to be defined only
by reference to the appended claims. All documents cited herein are incorporated
herein by reference in their entirety.
GGGGTTGTTTGCTTTTTTTCTTGTAAATTTG 1 1 lAAtil 
TCTTTGTAGATTCTGGATGTTAGCCCTTCGTCAGATG 
GATAGATTGCAAAAA 


TMCTATTGGTTCTAGAGAGCAGGACTCsGCii; 1 lAulu 
CAGCATACTGCTTTAAATATATCCATGTCTACATCCAC 
TTTTGTCTGTATGTCTATGTATCTA(T/G)CTATGTATCT 
ATCTAGCTATGTATCTATCTATCTATCTATCTATCATCT 
ATCTATCTATCTATCATCTATCCATCTATCATCTATCAT 

TTATCCATCTAT 


CTTCCCATCTTTACACTGGATGGC3I IUAAI ICitjiiAUia 
AATTACTGGACTCTGGAAGTTGAAGACTGTCCATATA 
ATTAAAATGTACAATAACTACCCAGG(T/G)TTACCTTG 
CAAGTTTCAACATACACAAAATTAACTTTATATGACTC 
TTCAAAAACAGTTTGCCATCATACCTAATAATCTGGTT 

TAAAI 1 1 IAAAAACTC 


TGCCCAGAGTGTGGCTTTAAAAGCTTCCCCATTGC 1 1 

CTCATGTGAAGCCAAGGTTGAGAATGACTAATTTAAG 

GCATTTCTGGTGGATATAAAGGACTA(Cfl")CACAGTCC 

AAGGCCATCCTGACTGACCTCACCTTCCAGGTGCCTA 

GCTCCATCCAGCTGGGCTCCTTTTCAACCCAATTATA 

ACTCTATTAATGTTGTTC 


AGAGTGTGGCTTTAAAAGCTTCCCt/A 1 1 Wi; ll o I cm i 

GTGAAGCCAAGGTTGAGAATGACTAATTTAAGGCATT 

TCTGGTGGATATAAAGGACTACCACA(G/A)TCCAAGG 

CCATCCTGACTGACCTCACCTTCCAGGTGCCTAGCTC 

CATCCAGCTGGGCTCCTTTTCAACCCAATTATAACTC 

TATTAATGTTGTTCCCAGC 


TAATTTAAGGCATTTCTGGTGGATATAAAliijAU I auo 

ACAGTCCAAGGCCATCCTGACTGACCTCACCTTCCAG 

GTGCCTAGCTCCATCCAGCTGGGCTC(C/A)I I I ICAA 

CCCAATTATAACTCTATTAATGTTGTTCCCAGCCAGG 

CATGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGA 

GGCCGAAGCAGGCGGATCA 
TTATGCACCTTAGTACTCCAGATAATGACCTTCA 1 1 IO 
TTTTCCAATTACCAT 


ATTTTTAGGGAACAAGGGAAAACAACCATAAQGTC; 1 G 

ACTGCCTGCAGGGTCGGGCAGAAAGAGCCATATTTT 1 

CCTTCTTGAGAGAGGCTATAMTGGA(C/G)ATGCAAG 

TAGGGAAGATATCACTAAATTCTTTTCCTAGCAAGGA 

GTATTATTATTAATACCCTGGGAAAGGAATGCATTCCT 

GGGGGGAGGTCTATAAACA 


TAGGGTGGGGAAAAACTCCGCCCTGGTAAATTTG1 G 

GTCAGACCGGTTCTCTGCTGTCGAACCCTGTTTGCTG 

TTGTTTAAGGTGTTTATCAAGACAGTA(C/T)GTGCACC 

GCTGAACATAGACCCTCATCTGTAGTTCTGCTTTTGC 

CCTTTGCCTTGTGATCTTTGTTGGACCCTTATCAGTG 

GTTCTGCTTTTGCCCTTTG 


TACAGCCAGGATTCATGTTACTTTTCATGGAAAAI GG 

GGGCAGTGACTACTGTCCTCCATAAAAGCTGCTGGG 

GAGAATTAGCCTAGCTATTGCAGGCTG(G/A)GATTGC 

TGCTTTCCTGGTGCTATTTCCAGCTACTCAGGCTCAC 

AGGGGCAGTTTTCTACAATGACATTTCAGGGTTGCTG 

ATGAGCCTCCCACTCAGCAG 


CTGGAGGATTTTAAGTATGTAAGTGGAACAA lulGl I 

TTTTTGTTTTTGTTTTTGTTTGAGAAGGAGTTTCGCTC 

TTGTTGCCCTGGCTGGAGTGCAATG(G/A)CATGATCT 

TGGCTCACTGCAACCCCTGCCTCCTGAGTTCAAGTGA 

TTCTCCTGCCTCAGCCTCCAAAATAGCTGGGATTGCA 

GGCGTGTGCCACCATGCC 
AGTAC 


GTCTGCGTGTATGACGGCTAGACAGGAGTTCAvjAijA 
ACAGCGGGGTCGCCAGGCCACCACCTGATGGGCCA 
CGGCTCATTGGCTCTAGGAGCTGGGAAAG(G/A)GCAT 
Table 6. Annotation of CYP3A4, CYP17 and SRD5A2 genomic sequences

CYP3A4

Annotation    Base pairs     Sub annotation       Base pairs
5' region     1-10481
Exon 1        10482-10642    5' UTR               10482-10571
                             Start codon          10572-10574
                             Translated region    10572-10642
Intron 1      10643-14574
Exon 2        14575-14668
Intron 2      14669-16579
Exon 3        16580-16632
Intron 3      16633-22072
Exon 4        22073-22172
Intron 4      22173-24526
Exon 5        24527-24640
Intron 5      24641-24905
Exon 6        24906-24994
Intron 6      24995-26259
Exon 7        26260-26408
Intron 7      26409-27502
Exon 8        27503-27630
Intron 8      27631-28314
Exon 9        28315-28381
Intron 9      28382-30736
Exon 10       30737-30897
Intron 10     30898-32482
Exon 11       32483-32709
Intron 11     32710-33768
Exon 12       33769-33931
Intron 12     33932-36520
Exon 13       36521-37073    Translated region    36521-36613
                             Stop codon           36614-36616
                             3' UTR               36617-37073
3' region     37074-39071

CYP17

Annotation    Base pairs     Sub annotation       Base pairs
5' region     1-9992
Exon 1        9993-10337     5' UTR               9993-10040
                             Start codon          10041-10043
                             Translated region    10041-10337
Intron 1      10338-12009
Exon 2        12010-12148
Intron 2      12149-12387
Exon 3        12388-12617
Intron 3      12618-13279
Exon 4        13280-13366
Intron 4      13367-14193
Exon 5        14194-14409
Intron 5      14410-14721
Exon 6        14722-14891
Intron 6      14892-15790
Exon 7        15791-15894
Intron 7      15895-16416
Exon 8        16417-16872    Translated region    16417-16697
                             Stop codon           16698-16700
                             3' UTR               16701-16872
3' region     16873-26865

SRD5A2

Annotation    Base pairs     Sub annotation       Base pairs
5' region     1-9995
Exon 1        9996-10307     5' UTR               9996-10026
                             Start codon          10027-10029
                             Translated region    10027-10307
Intron 1      10308-57160
Exon 2        57161-57324
Intron 2      57325-59454
Exon 3        59455-59556
Intron 3      59557-61469
Exon 4        61470-61620
Intron 4      61621-64664
Exon 5        64665-66344    Translated region    64665-64728
                             Stop codon           64729-64731
                             3' UTR               64732-66344
3' region     66345-76341
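Coordinates of the kind listed in Table 6 lend themselves to a simple position lookup, for example to decide whether a SNP at a given base pair falls in an exon or an intron. The sketch below uses the first four CYP3A4 rows from the table; the function names and the choice of a sorted-interval lookup are assumptions, and coordinates are treated as 1-based and inclusive.

```python
from bisect import bisect_right

# (annotation, first base pair, last base pair), from the CYP3A4 rows of Table 6.
CYP3A4_FEATURES = [
    ("5' region", 1, 10481),
    ("Exon 1", 10482, 10642),
    ("Intron 1", 10643, 14574),
    ("Exon 2", 14575, 14668),
]

def feature_at(position, features=CYP3A4_FEATURES):
    """Return the annotation containing `position`, or None if not covered.

    `features` must be sorted by start coordinate and non-overlapping.
    """
    starts = [start for _, start, _ in features]
    idx = bisect_right(starts, position) - 1
    if idx < 0:
        return None
    name, start, end = features[idx]
    return name if start <= position <= end else None

def feature_length(start, end):
    """Length in base pairs of an inclusive 1-based interval."""
    return end - start + 1

start_codon_feature = feature_at(10572)      # the start codon lies in Exon 1
exon1_length = feature_length(10482, 10642)
```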







